* [PATCH] mm/pg-r4k.c: Dump the generated code
@ 2007-10-02 13:54 Maciej W. Rozycki
  2007-10-02 14:11 ` Thiemo Seufer
  0 siblings, 1 reply; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-02 13:54 UTC (permalink / raw)
  To: Ralf Baechle, Thiemo Seufer; +Cc: linux-mips

 Dump the generated code for clear/copy page calls like it is done for TLB 
fault handlers.  Useful for debugging.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
---
Thiemo,

 It was your change to add ".set noreorder", etc. to the TLB fault 
handlers -- what is it needed for?  I have thought gas does not try to 
outsmart the user at the moment and does not reorder ".word" directives.

 Ralf, please apply.

  Maciej

patch-mips-2.6.23-rc5-20070904-pg-r4k-dump-0
diff -up --recursive --new-file linux-mips-2.6.23-rc5-20070904.macro/arch/mips/mm/pg-r4k.c linux-mips-2.6.23-rc5-20070904/arch/mips/mm/pg-r4k.c
--- linux-mips-2.6.23-rc5-20070904.macro/arch/mips/mm/pg-r4k.c	2007-02-05 16:38:47.000000000 +0000
+++ linux-mips-2.6.23-rc5-20070904/arch/mips/mm/pg-r4k.c	2007-10-01 22:50:13.000000000 +0000
@@ -347,6 +347,7 @@ void __init build_clear_page(void)
 {
 	unsigned int loop_start;
 	unsigned long off;
+	int i;
 
 	epc = (unsigned int *) &clear_page_array;
 	instruction_pending = 0;
@@ -434,12 +435,22 @@ dest = label();
 	build_jr_ra();
 
 	BUG_ON(epc > clear_page_array + ARRAY_SIZE(clear_page_array));
+
+	pr_info("Synthesized clear page handler (%u instructions).\n",
+		(unsigned int)(epc - clear_page_array));
+
+	pr_debug("\t.set push\n");
+	pr_debug("\t.set noreorder\n");
+	for (i = 0; i < (epc - clear_page_array); i++)
+		pr_debug("\t.word 0x%08x\n", clear_page_array[i]);
+	pr_debug("\t.set pop\n");
 }
 
 void __init build_copy_page(void)
 {
 	unsigned int loop_start;
 	unsigned long off;
+	int i;
 
 	epc = (unsigned int *) &copy_page_array;
 	store_offset = load_offset = 0;
@@ -515,4 +526,13 @@ dest = label();
 	build_jr_ra();
 
 	BUG_ON(epc > copy_page_array + ARRAY_SIZE(copy_page_array));
+
+	pr_info("Synthesized copy page handler (%u instructions).\n",
+		(unsigned int)(epc - copy_page_array));
+
+	pr_debug("\t.set push\n");
+	pr_debug("\t.set noreorder\n");
+	for (i = 0; i < (epc - copy_page_array); i++)
+		pr_debug("\t.word 0x%08x\n", copy_page_array[i]);
+	pr_debug("\t.set pop\n");
 }

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-02 13:54 [PATCH] mm/pg-r4k.c: Dump the generated code Maciej W. Rozycki
@ 2007-10-02 14:11 ` Thiemo Seufer
  2007-10-02 15:49   ` Ralf Baechle
  0 siblings, 1 reply; 93+ messages in thread
From: Thiemo Seufer @ 2007-10-02 14:11 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Ralf Baechle, linux-mips

Maciej W. Rozycki wrote:
>  Dump the generated code for clear/copy page calls like it is done for TLB 
> fault handlers.  Useful for debugging.
> 
> Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
> ---
> Thiemo,
> 
>  It was your change to add ".set noreorder", etc. to the TLB fault 
> handlers -- what is it needed for?  I have thought gas does not try to 
> outsmart the user at the moment and does not reorder ".word" directives.

It is not strictly needed, but it is a hint to the user that they are
looking at raw instructions.


Thiemo

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-02 14:11 ` Thiemo Seufer
@ 2007-10-02 15:49   ` Ralf Baechle
  2007-10-02 16:03     ` Thiemo Seufer
                       ` (2 more replies)
  0 siblings, 3 replies; 93+ messages in thread
From: Ralf Baechle @ 2007-10-02 15:49 UTC (permalink / raw)
  To: Thiemo Seufer; +Cc: Maciej W. Rozycki, linux-mips

On Tue, Oct 02, 2007 at 03:11:26PM +0100, Thiemo Seufer wrote:

> Maciej W. Rozycki wrote:
> >  Dump the generated code for clear/copy page calls like it is done for TLB 
> > fault handlers.  Useful for debugging.
> > 
> > Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
> > ---
> > Thiemo,
> > 
> >  It was your change to add ".set noreorder", etc. to the TLB fault 
> > handlers -- what is it needed for?  I have thought gas does not try to 
> > outsmart the user at the moment and does not reorder ".word" directives.
> 
> It is not strictly needed, but it is a hint to the user that he looks
> at raw instructions.

I have a patch which makes the generated code accessible through a
procfs file.  That can easily be converted back into a .o file and then
be disassembled.  So it's now a question of which variant is preferable.

I don't mind - it's just that I've never been a friend of leaving much
debugging code or features around.  99% of the time it just makes the
code harder to read and maintain.

  Ralf

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-02 15:49   ` Ralf Baechle
@ 2007-10-02 16:03     ` Thiemo Seufer
  2007-10-02 16:08     ` Maciej W. Rozycki
  2007-10-03 12:17     ` Franck Bui-Huu
  2 siblings, 0 replies; 93+ messages in thread
From: Thiemo Seufer @ 2007-10-02 16:03 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Maciej W. Rozycki, linux-mips

Ralf Baechle wrote:
> On Tue, Oct 02, 2007 at 03:11:26PM +0100, Thiemo Seufer wrote:
> 
> > Maciej W. Rozycki wrote:
> > >  Dump the generated code for clear/copy page calls like it is done for TLB 
> > > fault handlers.  Useful for debugging.
> > > 
> > > Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
> > > ---
> > > Thiemo,
> > > 
> > >  It was your change to add ".set noreorder", etc. to the TLB fault 
> > > handlers -- what is it needed for?  I have thought gas does not try to 
> > > outsmart the user at the moment and does not reorder ".word" directives.
> > 
> > It is not strictly needed, but it is a hint to the user that he looks
> > at raw instructions.
> 
> I have a patch which makes the generated code accessible through a
> procfs file.  That can easily be converted back into a .o file and then
> be disassembled.  So it's now a question of which variant is preferable.

I prefer output at startup. If you are interested in the disassembly you
probably don't have access to /proc. :-)


Thiemo

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-02 15:49   ` Ralf Baechle
  2007-10-02 16:03     ` Thiemo Seufer
@ 2007-10-02 16:08     ` Maciej W. Rozycki
  2007-10-03  1:00       ` Ralf Baechle
  2007-10-03 12:17     ` Franck Bui-Huu
  2 siblings, 1 reply; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-02 16:08 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Thiemo Seufer, linux-mips

On Tue, 2 Oct 2007, Ralf Baechle wrote:

> I have a patch which makes the generated code accessible through a
> procfs file.  That can easily be converted back into a .o file and then
> be disassembled.  So it's now a question of which variant is preferable.

 There is no need to go through such hassle even:

$ objdump -b binary -m mips:4000 -d /proc/foo

or suchlike should work (the program seems to be sensitive to the file 
size though, so it better be non-zero).
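
The same objdump invocation could also be pointed at the boot-log dump,
once the ".word" lines are turned back into raw bytes.  The sketch below
is only one way that might be done; parse_word_line(), store_word() and
words_to_binary() are hypothetical helper names, not part of the patch
or of binutils.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Find ".word 0x????????" in one log line and store the value in *out.
 * Returns 1 on success, 0 if the line carries no such directive.
 * Anything before ".word" (e.g. a log-level prefix) is skipped.
 */
int parse_word_line(const char *line, uint32_t *out)
{
	const char *p;
	unsigned int value;

	assert(line != NULL && out != NULL);
	p = strstr(line, ".word");
	if (p == NULL)
		return 0;
	if (sscanf(p, ".word 0x%x", &value) != 1)
		return 0;
	*out = (uint32_t)value;
	return 1;
}

/* Serialize one instruction word in the requested byte order. */
void store_word(uint32_t word, int big_endian, unsigned char buf[4])
{
	int i;

	for (i = 0; i < 4; i++) {
		int shift = big_endian ? 24 - 8 * i : 8 * i;

		buf[i] = (unsigned char)(word >> shift);
	}
}

/*
 * Copy every ".word" value found on `in` to `out` as raw bytes.
 * Returns the number of instruction words written.
 */
size_t words_to_binary(FILE *in, FILE *out, int big_endian)
{
	char line[256];
	unsigned char buf[4];
	uint32_t word;
	size_t count = 0;

	while (fgets(line, sizeof(line), in) != NULL) {
		if (!parse_word_line(line, &word))
			continue;
		store_word(word, big_endian, buf);
		if (fwrite(buf, 1, sizeof(buf), out) != sizeof(buf))
			break;
		count++;
	}
	return count;
}
```

A small main() calling words_to_binary(stdin, stdout, 1) on a captured
log would be enough to drive it; the byte-order flag has to match the
endianness of the kernel that produced the dump.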

> I don't mind - it's just that I've never been a friend of leaving much
> debugging code or features around.  99% of the time it just makes the
> code harder to read and maintain.

 In this case I would let these bits stay in though.  The bootstrap log 
always works and can be captured with the serial console or read from the 
screen, and if there is a subtle breakage in these generated bits then the 
system may never get far enough for procfs to be accessible.  It is these 
moments it matters the most.

  Maciej

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-02 16:08     ` Maciej W. Rozycki
@ 2007-10-03  1:00       ` Ralf Baechle
  2007-10-03  7:05         ` Geert Uytterhoeven
  0 siblings, 1 reply; 93+ messages in thread
From: Ralf Baechle @ 2007-10-03  1:00 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Thiemo Seufer, linux-mips

On Tue, Oct 02, 2007 at 05:08:05PM +0100, Maciej W. Rozycki wrote:

> > I have a patch which makes the generated code accessible through a
> > procfs file.  That can easily be converted back into a .o file and then
> > be disassembled.  So it's now a question of which variant is preferable.
> 
>  There is no need to go through such hassle even:
> 
> $ objdump -b binary -m mips:4000 -d /proc/foo
> 
> or suchlike should work (the program seems to be sensitive to the file 
> size though, so it better be non-zero).
> 
> > I don't mind - it's just that I've never been a friend of leaving much
> > debugging code or features around.  99% of the time it just makes the
> > code harder to read and maintain.
> 
>  In this case I would let these bits stay in though.  The bootstrap log 
> always works and can be captured with the serial console or read from the 
> screen, and if there is a subtle breakage in these generated bits then the 
> system may never get far enough for procfs to be accessible.  It is these 
> moments it matters the most.

I originally wrote my variant as a tool for optimization.

Anyway, queued for 2.6.24.  That is if 2.6.23 is ever released ;-)

  Ralf

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-03  1:00       ` Ralf Baechle
@ 2007-10-03  7:05         ` Geert Uytterhoeven
  2007-10-03 10:32           ` Ralf Baechle
  0 siblings, 1 reply; 93+ messages in thread
From: Geert Uytterhoeven @ 2007-10-03  7:05 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Maciej W. Rozycki, Thiemo Seufer, linux-mips

On Wed, 3 Oct 2007, Ralf Baechle wrote:
> Anyway, queued for 2.6.24.  That is if 2.6.23 is ever released ;-)

Any scripts relying on -rcX being single-digit? ;-)

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-03  7:05         ` Geert Uytterhoeven
@ 2007-10-03 10:32           ` Ralf Baechle
  0 siblings, 0 replies; 93+ messages in thread
From: Ralf Baechle @ 2007-10-03 10:32 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Maciej W. Rozycki, Thiemo Seufer, linux-mips

On Wed, Oct 03, 2007 at 09:05:43AM +0200, Geert Uytterhoeven wrote:

> On Wed, 3 Oct 2007, Ralf Baechle wrote:
> > Anyway, queued for 2.6.24.  That is if 2.6.23 is ever released ;-)
> 
> Any scripts relying on -rcX being single-digit? ;-)

        if ($tag =~ /linux-([0-9]+\.[0-9]).*-.*/) {
                $final = "/pub/linux/mips/kernel/v$1/testing/$tag.tar.gz";
        } elsif ($tag =~ /linux-([0-9]+\.[0-9])/) {
                $final = "/pub/linux/mips/kernel/v$1/$tag.tar.gz";


No :-)

  Ralf

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-02 15:49   ` Ralf Baechle
  2007-10-02 16:03     ` Thiemo Seufer
  2007-10-02 16:08     ` Maciej W. Rozycki
@ 2007-10-03 12:17     ` Franck Bui-Huu
  2007-10-03 13:11       ` Thiemo Seufer
  2007-10-03 13:41       ` [PATCH] mm/pg-r4k.c: Dump the generated code Ralf Baechle
  2 siblings, 2 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-03 12:17 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Thiemo Seufer, Maciej W. Rozycki, linux-mips

Ralf Baechle wrote:
> I don't mind - it's just that I've never been a friend of leaving much
> debugging code or features around.  99% of the time it just makes the
> code harder to read and maintain.
> 

Yeah, this kind of code is really hard to follow and therefore hard to
maintain, I guess.

I'm wondering if we couldn't implement such a code generator as a
tool or script run during the build process. This tool could emit
the assembler code during the early phase of the build into an
assembler file, which could then be compiled like any other one. I see
3 main benefits:

  - It would simplify the kernel code a lot.
  - It would decrease the size of the kernel.
  - It would make the generated disassembly easy to read.

One issue to deal with is that some instructions need to be emitted
according to the type of the cpu, which can only be determined at run
time. In this case we could leave some room in the generated code
for additional instructions, which could be filled in or patched at
boot time by using a 'patch table'. If the cpu doesn't need to patch
the generated code then the unused space would be discarded when
installing the handler in its final place.
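
A minimal sketch of what such a 'patch table' might look like, assuming
a build-time template with reserved slots that are either filled or
dropped while the handler is copied to its final place.  The structure
layout, the flag bits and install_handler() are all made up for
illustration; nothing like this exists in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One reserved slot in the build-time template (hypothetical layout). */
struct patch_entry {
	size_t offset;		/* index of the slot in the template */
	unsigned int flag;	/* CPU feature bit that needs this fixup */
	uint32_t insn;		/* instruction to place in the slot */
};

/*
 * Copy `tmpl` to `dst`, filling the slots whose flag is set in
 * `cpu_flags` and dropping the others.  `table` must be sorted by
 * offset.  Returns the number of instructions written to `dst`.
 */
size_t install_handler(uint32_t *dst, const uint32_t *tmpl, size_t len,
		       const struct patch_entry *table, size_t nr,
		       unsigned int cpu_flags)
{
	size_t in, out = 0, t = 0;

	assert(dst != NULL && tmpl != NULL);
	for (in = 0; in < len; in++) {
		if (t < nr && table[t].offset == in) {
			/* Reserved slot: fill it or discard it. */
			if (cpu_flags & table[t].flag)
				dst[out++] = table[t].insn;
			t++;
			continue;
		}
		dst[out++] = tmpl[in];
	}
	return out;
}
```

The sketch ignores branches: any branch that spans a dropped slot would
also need its offset adjusted when the code is compacted.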

Just a thought but I'm probably missing something.

		Franck

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-03 12:17     ` Franck Bui-Huu
@ 2007-10-03 13:11       ` Thiemo Seufer
  2007-10-03 13:51         ` Maciej W. Rozycki
  2007-10-03 19:45         ` Franck Bui-Huu
  2007-10-03 13:41       ` [PATCH] mm/pg-r4k.c: Dump the generated code Ralf Baechle
  1 sibling, 2 replies; 93+ messages in thread
From: Thiemo Seufer @ 2007-10-03 13:11 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Ralf Baechle, Maciej W. Rozycki, linux-mips

Franck Bui-Huu wrote:
> Ralf Baechle wrote:
> > I don't mind - it's just that I've never been a friend of leaving much
> > debugging code or features around.  99% of the time it just makes the
> > code harder to read and maintain.
> > 
> 
> Yeah, this kind of code is really hard to follow and therefore hard to
> maintain, I guess.
> 
> I'm wondering if we couldn't implement such a code generator as a
> tool or script run during the build process.
>
> This tool could emit
> the assembler code during the early phase of the build into an
> assembler file, which could then be compiled like any other one. I see
> 3 main benefits:
> 
>   - It would simplify the kernel code a lot.
>   - It would decrease the size of the kernel.
>   - It would make the generated disassembly easy to read.
> 
> One issue to deal with is that some instructions need to be emitted
> according to the type of the cpu, which can only be determined at run
> time. In this case we could leave some room in the generated code
> for additional instructions, which could be filled in or patched at
> boot time by using a 'patch table'. If the cpu doesn't need to patch
> the generated code then the unused space would be discarded when
> installing the handler in its final place.

Then you have the worst of both approaches: The nicely readable
disassembly will change under your feet, and you still need relocation
annotations etc. for CPU-specific fixups. The end-result is likely
more complicated and opaque than what we have now.


Thiemo

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-03 12:17     ` Franck Bui-Huu
  2007-10-03 13:11       ` Thiemo Seufer
@ 2007-10-03 13:41       ` Ralf Baechle
  1 sibling, 0 replies; 93+ messages in thread
From: Ralf Baechle @ 2007-10-03 13:41 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Thiemo Seufer, Maciej W. Rozycki, linux-mips

On Wed, Oct 03, 2007 at 02:17:56PM +0200, Franck Bui-Huu wrote:

> Ralf Baechle wrote:
> > I don't mind - it's just that I've never been a friend of leaving much
> > debugging code or features around.  99% of the time it just makes the
> > code harder to read and maintain.
> > 
> 
> Yeah, this kind of code is really hard to follow and therefore hard to
> maintain, I guess.
> 
> I'm wondering if we couldn't implement such a code generator as a
> tool or script run during the build process. This tool could emit
> the assembler code during the early phase of the build into an
> assembler file, which could then be compiled like any other one. I see
> 3 main benefits:
> 
>   - It would simplify the kernel code a lot.
>   - It would decrease the size of the kernel.
>   - It would make the generated disassembly easy to read.
> 
> One issue to deal with is that some instructions need to be emitted
> according to the type of the cpu, which can only be determined at run
> time. In this case we could leave some room in the generated code
> for additional instructions, which could be filled in or patched at
> boot time by using a 'patch table'. If the cpu doesn't need to patch
> the generated code then the unused space would be discarded when
> installing the handler in its final place.
> 
> Just a thought but I'm probably missing something.

We went for the runtime generation because this is about the only sane
way we can get support for the widest range of cores yet not compromise
on performance.  Maintaining the previous generation of that code, which
was something like a dozen variants of page clearing, copying and TLB
exception handlers, was definitely more tedious than this.

  Ralf

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-03 13:11       ` Thiemo Seufer
@ 2007-10-03 13:51         ` Maciej W. Rozycki
  2007-10-03 19:45         ` Franck Bui-Huu
  1 sibling, 0 replies; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-03 13:51 UTC (permalink / raw)
  To: Thiemo Seufer; +Cc: Franck Bui-Huu, Ralf Baechle, linux-mips

On Wed, 3 Oct 2007, Thiemo Seufer wrote:

> Then you have the worst of both approaches: The nicely readable
> disassembly will change under your feet, and you still need relocation
> annotations etc. for CPU-specific fixups. The end-result is likely
> more complicated and opaque than what we have now.

 Well, to be honest what we have now is very good.  One trouble at the 
beginning, just after we switched from the old approach, was limited 
ability to get at what really is generated and therefore tough time to 
determine what was going on if something was wrong.  With these debug 
dumps in place it is gone now too.

 There is one limitation though -- unlike with ready-written assembly, to 
debug this code you typically need to have a specific system that shows a 
problem.  If you do not have one chances are you can miss a condition 
somewhere and therefore the problem.  Once you have the right piece of 
hardware, debugging is easy -- it took me half of a day if not less to 
sort out all the issues with the R3000 TLB handlers that we had once I got 
my hands on a suitable system.

 And as with everything, there is still room for improvement though.  For 
example I have noticed for the 64-bit TLB refill handler the path for 
vmalloc()ed pages may fit entirely in half of the space available.  Which 
means whatever is emitted after "eret" may be shifted to the TLB refill 
space at 0x80000000 saving the branch from the XTLB space at its end.  
That is probably doable with reasonably little effort given that we have 
support for "relocations".

  Maciej

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-03 13:11       ` Thiemo Seufer
  2007-10-03 13:51         ` Maciej W. Rozycki
@ 2007-10-03 19:45         ` Franck Bui-Huu
  2007-10-03 20:18           ` Thiemo Seufer
  1 sibling, 1 reply; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-03 19:45 UTC (permalink / raw)
  To: Thiemo Seufer; +Cc: Ralf Baechle, Maciej W. Rozycki, linux-mips

Thiemo Seufer wrote:
> 
> Then you have the worst of both approaches: The nicely readable
> disassembly will change under your feet, and you still need
> relocation annotations etc. for CPU-specific fixups. The end-result
> is likely more complicated and opaque than what we have now.

Let's say we generate handlers with all possible cpu fixups. Very few
instructions would be removed so the disassembly should be quite
similar after patching. And by emitting some nice comments in the
generated code, it should be fairly obvious to get an idea of the
final code.

All fixups would be listed in a table with some flags to identify them
and a list of instructions which need to be relocated.

It seems to me that the kernel code would be much simpler than what we
have now. Regarding the script used to generate the assembly code, I
think it would be too.

Thanks,
		Franck

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-03 19:45         ` Franck Bui-Huu
@ 2007-10-03 20:18           ` Thiemo Seufer
  2007-10-04  7:33             ` Franck Bui-Huu
  0 siblings, 1 reply; 93+ messages in thread
From: Thiemo Seufer @ 2007-10-03 20:18 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Ralf Baechle, Maciej W. Rozycki, linux-mips

Franck Bui-Huu wrote:
> Thiemo Seufer wrote:
> > 
> > Then you have the worst of both approaches: The nicely readable
> > disassembly will change under your feet, and you still need
> > relocation annotations etc. for CPU-specific fixups. The end-result
> > is likely more complicated and opaque than what we have now.
> 
> Let's say we generate handlers with all possible cpu fixups. Very few
> instructions would be removed so the disassembly should be quite
> similar after patching.

No way. Just check the possible variations: 64bit, highmem, SMP,
and so on.

> And by emitting some nice comments in the
> generated code, it should be fairly obvious to get an idea of the
> final code.
> 
> All fixups would be listed in a table with some flags to identify them
> and a list of instructions which need to be relocated.

At that point you have invented something which effectively emits
the sourcecode for tlbex.c.

> It seems to me that the kernel code would be much simpler than what we
> have now. Regarding the script used to generate the assembly code, I
> think it would be too.

I doubt that.


Thiemo

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-03 20:18           ` Thiemo Seufer
@ 2007-10-04  7:33             ` Franck Bui-Huu
  2007-10-04 10:30               ` Maciej W. Rozycki
  2007-10-04 12:15               ` Ralf Baechle
  0 siblings, 2 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-04  7:33 UTC (permalink / raw)
  To: Thiemo Seufer; +Cc: Ralf Baechle, Maciej W. Rozycki, linux-mips

Thiemo Seufer wrote:
> Franck Bui-Huu wrote:
>> Thiemo Seufer wrote:
>>> Then you have the worst of both approaches: The nicely readable
>>> disassembly will change under your feet, and you still need
>>> relocation annotations etc. for CPU-specific fixups. The end-result
>>> is likely more complicated and opaque than what we have now.
>> Let's say we generate handlers with all possible cpu fixups. Very few
>> instructions would be removed so the disassembly should be quite
>> similar after patching.
> 
> No way. Just check the possible variations: 64bit, highmem, SMP,
> and so on.
> 

You just listed some variations that are known at compile time. What I
meant by "all possible cpu fixups" is all fixups for a specific cpu
which can be known only at runtime.

>> And by emitting some nice comments in the
>> generated code, it should be fairly obvious to get an idea of the
>> final code.
>>
>> All fixups would be listed in a table with some flags to identify them
>> and a list of instructions which need to be relocated.
> 
> At that point you have invented something which effectively emits
> the sourcecode for tlbex.c.
> 

Not really, I would say it's just an idea to remove tlbex.c from the
kernel code and to make it a tool called during compile time to
generate a handler skeleton which would be finalized by the kernel.

Thanks,
		Franck

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-04  7:33             ` Franck Bui-Huu
@ 2007-10-04 10:30               ` Maciej W. Rozycki
  2007-10-04 12:15               ` Ralf Baechle
  1 sibling, 0 replies; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-04 10:30 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Thiemo Seufer, Ralf Baechle, linux-mips

On Thu, 4 Oct 2007, Franck Bui-Huu wrote:

> Not really, I would say it's just an idea to remove tlbex.c from the
> kernel code and to make it a tool called during compile time to
> generate a handler skeleton which would be finalized by the kernel.

 Thanks for volunteering.  When you finally come up with an implementation 
of a solution that is much better than the current one I am absolutely 
sure it will be accepted eagerly.

  Maciej

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-04  7:33             ` Franck Bui-Huu
  2007-10-04 10:30               ` Maciej W. Rozycki
@ 2007-10-04 12:15               ` Ralf Baechle
  2007-10-04 15:01                 ` Franck Bui-Huu
  1 sibling, 1 reply; 93+ messages in thread
From: Ralf Baechle @ 2007-10-04 12:15 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Thiemo Seufer, Maciej W. Rozycki, linux-mips

On Thu, Oct 04, 2007 at 09:33:08AM +0200, Franck Bui-Huu wrote:

> Not really, I would say it's just an idea to remove tlbex.c from the
> kernel code and to make it a tool called during compile time to
> generate a handler skeleton which would be finalized by the kernel.

IRIX was assembling its TLB exception handler from a few such skeletons
or rather a few fractions.  That works reasonably well as long as there are
not too many variants - but Linux supports about anything on earth.
Another disadvantage of the IRIX approach was that the fragments are
written in assembler but the tacking together happens in C code so the
code is split in a somewhat unnatural way over a few files.

  Ralf

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-04 12:15               ` Ralf Baechle
@ 2007-10-04 15:01                 ` Franck Bui-Huu
  2007-10-04 15:23                   ` Maciej W. Rozycki
  2007-10-05 11:51                   ` Ralf Baechle
  0 siblings, 2 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-04 15:01 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Thiemo Seufer, Maciej W. Rozycki, linux-mips

Ralf Baechle wrote:
> On Thu, Oct 04, 2007 at 09:33:08AM +0200, Franck Bui-Huu wrote:
> 
>> Not really, I would say it's just an idea to remove tlbex.c from the
>> kernel code and to make it a tool called during compile time to
>> generate a handler skeleton which would be finalized by the kernel.
> 
> IRIX was assembling its TLB exception handler from a few such skeletons
> or rather a few fractions.  That works reasonably well as long as there are
> not too many variants - but Linux supports about anything on earth.
> Another disadvantage of the IRIX approach was that the fragments are
> written in assembler but the tacking together happens in C code so the
> code is split in a somewhat unnatural way over a few files.
> 

That's what I was thinking too. It may require a lot of (ugly?)
tricks to link the whole thing together. And if the idea was
previously used and proved inferior to what we have now, it's
just a bad idea.

It's just a bit sad to see my TLB handler generated at each boot and
to embed the whole tlbex generator inside the kernel which is quite
big:

   $ mipsel-linux-size arch/mips/mm/tlbex.o
      text    data     bss     dec     hex filename
     10116    3904    1568   15588    3ce4 arch/mips/mm/tlbex.o

especially if my cpu doesn't have any bugs.

Maybe we could have 2 default implementations, in tlbex-r3k.S and
tlbex-r4k.S, for good cpus (the ones which don't need any fixups at
all), and use tlbex.c otherwise. And with luck the majority of the
cpus are good...

OK, probably a bad idea (again) ...

Thanks
		Franck

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-04 15:01                 ` Franck Bui-Huu
@ 2007-10-04 15:23                   ` Maciej W. Rozycki
  2007-10-04 15:30                     ` Ralf Baechle
  2007-10-05  8:03                     ` Franck Bui-Huu
  2007-10-05 11:51                   ` Ralf Baechle
  1 sibling, 2 replies; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-04 15:23 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Ralf Baechle, Thiemo Seufer, linux-mips

On Thu, 4 Oct 2007, Franck Bui-Huu wrote:

> It's just a bit sad to see my TLB handler generated at each boot and
> to embed the whole tlbex generator inside the kernel which is quite
> big:
> 
>    $ mipsel-linux-size arch/mips/mm/tlbex.o
>       text    data     bss     dec     hex filename
>      10116    3904    1568   15588    3ce4 arch/mips/mm/tlbex.o
> 
> especially if my cpu doesn't have any bugs.

 Well, most systems are there to work and not to be rebooted repeatedly 
all the time. ;-)  All of tlbex.o is discarded after bootstrap.

> Maybe we could have 2 default implementations, in tlbex-r3k.S and
> tlbex-r4k.S, for good cpus (the ones which don't need any fixups at
> all), and use tlbex.c otherwise. And with luck the majority of the
> cpus are good...

 Well, most of the differences are not due to CPU bugs, but different cp0 
hazards.  The MIPS32r2 and MIPS64r2 architecture specs introduce the "ehb" 
and "jr.hb" instructions to sort them out, but most of the processors we 
support predate them.

 The definitions in <asm/war.h> are there so that workarounds for CPU 
bugs are optimised away at kernel build time if not activated.
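
 How a build-time constant lets the compiler drop an inactive 
workaround can be sketched as follows, assuming a hypothetical switch 
EXAMPLE_CPU_WAR and a made-up helper emit_with_war(); the real macros 
live in <asm/war.h> and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical workaround switch; a real build would take the value
 * from the platform's war.h.  0 means the CPU bug is not present.
 */
#ifndef EXAMPLE_CPU_WAR
#define EXAMPLE_CPU_WAR 0
#endif

#define INSN_NOP 0x00000000u	/* MIPS "nop" (sll $0, $0, 0) */

/*
 * Append `insn` to `buf` at index `pos`, preceded by a nop when the
 * workaround is enabled, and return the next free index.  With
 * EXAMPLE_CPU_WAR defined as 0 the condition is a compile-time
 * constant, so the compiler can drop the branch and the extra store.
 */
size_t emit_with_war(uint32_t *buf, size_t pos, uint32_t insn)
{
	assert(buf != NULL);
	if (EXAMPLE_CPU_WAR)
		buf[pos++] = INSN_NOP;
	buf[pos++] = insn;
	return pos;
}
```

 Building the same file with -DEXAMPLE_CPU_WAR=1 would bring the extra 
instruction back without touching the generator code itself.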

 I agree that including both the R3k and R4k handlers at the same time, 
even though any configuration predetermines which one of the two is 
going to be needed, is indeed a bit suboptimal.

  Maciej

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-04 15:23                   ` Maciej W. Rozycki
@ 2007-10-04 15:30                     ` Ralf Baechle
  2007-10-04 15:35                       ` Maciej W. Rozycki
  2007-10-05  8:03                     ` Franck Bui-Huu
  1 sibling, 1 reply; 93+ messages in thread
From: Ralf Baechle @ 2007-10-04 15:30 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Franck Bui-Huu, Thiemo Seufer, linux-mips

On Thu, Oct 04, 2007 at 04:23:42PM +0100, Maciej W. Rozycki wrote:

>  The definitions in <asm/war.h> are there so that workarounds for CPU 
> bugs are optimised away at kernel build time if not activated.
> 
>  I agree that including both the R3k and R4k handlers at the same time, 
> even though any configuration predetermines which one of the two is 
> going to be needed, is indeed a bit suboptimal.

I guess one of the goals was to slowly clean up the stuff that forces us
to have different kernels for R2000 and R4000 class TLBs.

  Ralf

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-04 15:30                     ` Ralf Baechle
@ 2007-10-04 15:35                       ` Maciej W. Rozycki
  2007-10-04 15:42                         ` Ralf Baechle
  0 siblings, 1 reply; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-04 15:35 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Franck Bui-Huu, Thiemo Seufer, linux-mips

On Thu, 4 Oct 2007, Ralf Baechle wrote:

> >  I agree that including both the R3k and R4k handlers at the same time, 
> > even though any configuration predetermines which one of the two is 
> > going to be needed, is indeed a bit suboptimal.
> 
> I guess one of the goals was to slowly clean up the stuff that forces us
> to have different kernels for R2000 and R4000 class TLBs.

 Well, we had a plan to support multiple systems with a "generic" kernel 
too; at least ones that have a compatible load address.  Which would help 
distributions create their bootstrap disks for example.  I have thought 
all of this got abandoned at one point, mostly due to the maintenance 
effort required to keep it going long-term.  The Alpha port did it many 
years ago, but they have a compatible bootstrap environment and their 
number of system variations is limited, especially as compared to ours.

  Maciej

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-04 15:35                       ` Maciej W. Rozycki
@ 2007-10-04 15:42                         ` Ralf Baechle
  2007-10-04 17:34                           ` Maciej W. Rozycki
  0 siblings, 1 reply; 93+ messages in thread
From: Ralf Baechle @ 2007-10-04 15:42 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Franck Bui-Huu, Thiemo Seufer, linux-mips

On Thu, Oct 04, 2007 at 04:35:39PM +0100, Maciej W. Rozycki wrote:

> > >  I agree the inclusion both R3k and R4k handlers at the same time even 
> > > though any configuration predetermines which of the two is only going to 
> > > be needed is a bit suboptimal indeed.
> > 
> > I guess one of the goals was to slowly clean up the stuff that forces us
> > to have different kernels for R2000 and R4000 class TLBs.
> 
>  Well, we had a plan to support multiple systems with a "generic" kernel 
> too; at least ones that have a compatible load address.  Which would help 
> distributions create their bootstrap disks for example.  I have thought 
> all of this got abandoned at one point, mostly due to the maintenance 
> effort required to keep it going long-term.  The Alpha port did it many 
> years ago, but they have a compatible bootstrap environment and their 
> number of system variations is limited, especially as compared to ours.

Anything in excessive amounts is toxic and that includes compatibility.
A true MIPS generic kernel would be hard to do.  But we have kernels that
can support all variants of the Malta even though Malta has more CPU options
than any other system.  Or for your personal toy project, all DECs wouldn't
be too hard either, or?

  Ralf

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-04 15:42                         ` Ralf Baechle
@ 2007-10-04 17:34                           ` Maciej W. Rozycki
  2007-10-08 15:46                             ` Maciej W. Rozycki
  0 siblings, 1 reply; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-04 17:34 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Franck Bui-Huu, Thiemo Seufer, linux-mips

On Thu, 4 Oct 2007, Ralf Baechle wrote:

> Anything in excessive amounts is toxic and that includes compatibility.
> A true MIPS generic kernel would be hard to do.  But we have kernels that
> can support all variants of the Malta even though Malta has more CPU options

 Have the issues been fixed?  I recall there was a problem with FPU 
context switching which would not let a MIPS IV Malta kernel (needed for 
all the old QED CPU core cards) run with a MIPS32r2 core.

> than any other system.  Or for your personal toy project, all DECs wouldn't
> be too hard either, or?

 The DECs should be relatively easy if we finally managed to get rid of 
all the 64-bit-isms in the 32-bit kernel even if built for MIPS III or 
above.  Which, given the recent commitment to 32-bit cores is what I would 
actually expect.

  Maciej

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-04 15:23                   ` Maciej W. Rozycki
  2007-10-04 15:30                     ` Ralf Baechle
@ 2007-10-05  8:03                     ` Franck Bui-Huu
  2007-10-05  9:09                       ` Geert Uytterhoeven
  2007-10-05 12:19                       ` Maciej W. Rozycki
  1 sibling, 2 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-05  8:03 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Ralf Baechle, Thiemo Seufer, linux-mips

Maciej W. Rozycki wrote:
> On Thu, 4 Oct 2007, Franck Bui-Huu wrote:
> 
>> It's just a bit sad to see my TLB handler generated at each boot and
>> to embed the whole tlbex generator inside the kernel which is quite
>> big:
>>
>>    $ mipsel-linux-size arch/mips/mm/tlbex.o
>>       text    data     bss     dec     hex filename
>>      10116    3904    1568   15588    3ce4 arch/mips/mm/tlbex.o
>>
>> specially if my cpu doesn't have any bugs.
> 
>  Well, most systems are there to work and not to be rebooted repeatedly 
> all the time. ;-)  All of tlbex.o is discarded after bootstrap.
> 

Yes, but some systems out there have constraints on their boot time,
and others on the size of their persistent storage.

>> Maybe having, 2 default implementations in tlbex-r3k.S, tlbex-r4k.S
>> for good cpus (the ones which needn't any fixups at all) and otherwise
>> the tlbex.c is used. And with luck the majority of the cpus are
>> good...
> 
>  Well, most of the differences are not due to CPU bugs, but different cp0 
> hazards.  The MIPS32r2 and MIPS64r2 architecture specs introduce the "ehb" 
> and "jr.hb" instructions to sort them out, but most of the processors we 
> support predate them.
> 
>  The existence of the definitions in <asm/war.h> is there so that 
> workarounds for CPU bugs are optimised away at the kernel build time if 
> not activated.
> 

Just to be sure I haven't missed anything: it seems that we _could_ generate
the whole TLB handler at compile time, since the CPU type is known at that
time, so no fixups would be needed at runtime, would they?

		Franck
 

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-05  8:03                     ` Franck Bui-Huu
@ 2007-10-05  9:09                       ` Geert Uytterhoeven
  2007-10-08 15:02                         ` Franck Bui-Huu
  2007-10-05 12:19                       ` Maciej W. Rozycki
  1 sibling, 1 reply; 93+ messages in thread
From: Geert Uytterhoeven @ 2007-10-05  9:09 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Maciej W. Rozycki, Ralf Baechle, Thiemo Seufer, linux-mips

On Fri, 5 Oct 2007, Franck Bui-Huu wrote:
> Maciej W. Rozycki wrote:
> > On Thu, 4 Oct 2007, Franck Bui-Huu wrote:
> > 
> >> It's just a bit sad to see my TLB handler generated at each boot and
> >> to embed the whole tlbex generator inside the kernel which is quite
> >> big:
> >>
> >>    $ mipsel-linux-size arch/mips/mm/tlbex.o
> >>       text    data     bss     dec     hex filename
> >>      10116    3904    1568   15588    3ce4 arch/mips/mm/tlbex.o
> >>
> >> specially if my cpu doesn't have any bugs.
> > 
> >  Well, most systems are there to work and not to be rebooted repeatedly 
> > all the time. ;-)  All of tlbex.o is discarded after bootstrap.
> > 
> 
> Yes, but some systems out there have some constraints on their boot time
> and others have ones on their persistent storage device size.
> 
> >> Maybe having, 2 default implementations in tlbex-r3k.S, tlbex-r4k.S
> >> for good cpus (the ones which needn't any fixups at all) and otherwise
> >> the tlbex.c is used. And with luck the majority of the cpus are
> >> good...
> > 
> >  Well, most of the differences are not due to CPU bugs, but different cp0 
> > hazards.  The MIPS32r2 and MIPS64r2 architecture specs introduce the "ehb" 
> > and "jr.hb" instructions to sort them out, but most of the processors we 
> > support predate them.
> > 
> >  The existence of the definitions in <asm/war.h> is there so that 
> > workarounds for CPU bugs are optimised away at the kernel build time if 
> > not activated.
> 
> Just to be sure I haven't missed anything: it seems that we _could_ generate
> the whole TLB handler at compile time, since the CPU type is known at that
> time, so no fixups would be needed at runtime, would they?

For specialized systems, you can always introduce the option to generate
the TLB handler at compile time:
  - Enhance tlbex.c to be able to compile it for the host, and generate
    a fixed TLB handler, based on CONFIG_* options, if
    CONFIG_STATIC_TLB_HANDLER (buried deep in depends on EMBEDDED &&
    ADVANCED && I_KNOW_WHAT_I_AM_DOING) is set.
  - Let the dynamic runtime generator print the required CONFIG_*
    options for the system it runs on, so you know which one to set in
    your .config (a bit like calibrate_delay() prints the lpj=N value to
    pass to avoid calibrating the delay loop)
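
As a rough sketch of the compile-time route (not code from tlbex.c): a
host-built generator would have to write the synthesized instruction words
out as assembler source, for instance in the same ".word" form that the dump
patch at the top of this thread prints. The helper name dump_handler_words
and its buffer-based interface are assumptions made purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of tlbex.c): format `count` synthesized
 * instruction words as assembler source, bracketed by ".set push" /
 * ".set noreorder" / ".set pop" like the dump in the original patch.
 *
 * Returns the number of characters written (excluding the trailing NUL),
 * or -1 if `out` is too small to hold the whole listing.
 */
int dump_handler_words(char *out, size_t out_size,
                       const uint32_t *code, size_t count)
{
    size_t used = 0;
    int n;

    n = snprintf(out, out_size, "\t.set push\n\t.set noreorder\n");
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    used = (size_t)n;

    for (size_t i = 0; i < count; i++) {
        n = snprintf(out + used, out_size - used,
                     "\t.word 0x%08x\n", (unsigned int)code[i]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(out + used, out_size - used, "\t.set pop\n");
    if (n < 0 || (size_t)n >= out_size - used)
        return -1;
    used += (size_t)n;

    return (int)used;
}
```

A build step could then place such a listing in a generated .S file, provided
the CONFIG_* selection really does pin down every input of the generator.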

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-04 15:01                 ` Franck Bui-Huu
  2007-10-04 15:23                   ` Maciej W. Rozycki
@ 2007-10-05 11:51                   ` Ralf Baechle
  2007-10-08 14:11                     ` Franck Bui-Huu
  2007-10-09 20:33                     ` Franck Bui-Huu
  1 sibling, 2 replies; 93+ messages in thread
From: Ralf Baechle @ 2007-10-05 11:51 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Thiemo Seufer, Maciej W. Rozycki, linux-mips

On Thu, Oct 04, 2007 at 05:01:32PM +0200, Franck Bui-Huu wrote:

(Hitting the send key now so nobody notices I wrote this email at 3am ;-)

> It's just a bit sad to see my TLB handler generated at each boot and
> to embed the whole tlbex generator inside the kernel which is quite
> big:
> 
>    $ mipsel-linux-size arch/mips/mm/tlbex.o
>       text    data     bss     dec     hex filename
>      10116    3904    1568   15588    3ce4 arch/mips/mm/tlbex.o
> 
> specially if my cpu doesn't have any bugs.

So I did a few experiments.  This is the size of tlbex for a malta_defconfig
build with gcc 4.2.1:

   text    data     bss     dec     hex filename
  10468    3904    1568   15940    3e44 arch/mips/mm/tlbex.o

After replacing current_cpu_data.cputype with a new macro current_cpu_type
that expands to the constant CPU type value, I picked CPU_4KC:

   text    data     bss     dec     hex filename
   6088    3904    1568   11560    2d28 arch/mips/mm/tlbex.o

And after also changing r45k_bvahwbug, r4k_250MHZhwbug, bcm1250_m3_war,
r10000_llsc_war and m4kc_tlbp_war into inline functions:

   text    data     bss     dec     hex filename
   5608    3904    1568   11080    2b48 arch/mips/mm/tlbex.o

So I applied the inlining change to the queue tree and came up with a
generalized version of current_cpu_type.  These are the sizes I get
for a Malta kernel without and with hardwiring the CPU type to 4Kc:

     text    data     bss     dec     hex filename
  3273876  142324  140944 3557144  364718 vmlinux
  3267048  142324  140944 3550316  362c6c vmlinux

6828 bytes isn't totally amazing, but since the optimization is reasonably
clean I'm going to queue this one also.
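
The effect described above can be illustrated with a minimal sketch; it is
not the queued patch. The names HARDWIRED_CPU_TYPE, probed_cpu_type,
cpu_needs_extra_words and handler_length are assumptions for illustration
only. When the CPU type expands to a constant, the compiler can fold the
workaround test and drop the unused branch of the generator.

```c
#include <stdbool.h>

/* Hypothetical CPU identifiers; the kernel's real list is in <asm/cpu.h>. */
enum cpu_type { CPU_UNKNOWN, CPU_R4000, CPU_R10000, CPU_4KC };

/* Stand-in for current_cpu_data.cputype, filled in by probing at boot. */
static enum cpu_type probed_cpu_type = CPU_UNKNOWN;

void set_probed_cpu_type(enum cpu_type t)
{
    probed_cpu_type = t;
}

/*
 * If the platform hardwires the CPU type (HARDWIRED_CPU_TYPE is an assumed
 * macro name), current_cpu_type() is a compile-time constant; otherwise it
 * reads the value found by probing.
 */
#ifdef HARDWIRED_CPU_TYPE
#define current_cpu_type() (HARDWIRED_CPU_TYPE)
#else
#define current_cpu_type() (probed_cpu_type)
#endif

/* Hypothetical workaround predicate, written as an inline function so
 * that it folds to a constant when the CPU type is hardwired. */
static inline bool cpu_needs_extra_words(void)
{
    return current_cpu_type() == CPU_R10000;
}

/* Number of words a toy generator would emit for the current CPU. */
int handler_length(void)
{
    int words = 4;              /* common part of the handler */

    if (cpu_needs_extra_words())
        words += 2;             /* removed entirely when hardwired to 4Kc */

    return words;
}
```

Built with -DHARDWIRED_CPU_TYPE=CPU_4KC, handler_length() reduces to a
constant 4; without it, the test on the probed type stays in the code.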

  Ralf

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-05  8:03                     ` Franck Bui-Huu
  2007-10-05  9:09                       ` Geert Uytterhoeven
@ 2007-10-05 12:19                       ` Maciej W. Rozycki
  2007-10-08 14:48                         ` Franck Bui-Huu
  1 sibling, 1 reply; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-05 12:19 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Ralf Baechle, Thiemo Seufer, linux-mips

On Fri, 5 Oct 2007, Franck Bui-Huu wrote:

> Just to be sure I haven't missed anything, it seems that we _could_ generate
> the whole tlb handler at compile time since the CPU type is known at that
> time, no need to have any fixups at runtime, isn't it ?

 The exact CPU type is not known at the moment.  For example CPU_R4X00 and 
CPU_MIPS32_R1 cover whole ranges that have subtle differences.  It may be 
possible to provide all the variations as a selection to the user, but it 
may be unfeasible -- I don't know.  Compare what we have in 
arch/mips/Kconfig with <asm/cpu.h>.

  Maciej

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-05 11:51                   ` Ralf Baechle
@ 2007-10-08 14:11                     ` Franck Bui-Huu
  2007-10-08 14:41                       ` Ralf Baechle
  2007-10-09 20:33                     ` Franck Bui-Huu
  1 sibling, 1 reply; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-08 14:11 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Thiemo Seufer, Maciej W. Rozycki, linux-mips

Ralf Baechle wrote:
> 6828 bytes isn't totally amazing, but since the optimization is reasonably
> clean I'm going to queue this one also.
> 

Yes, and maybe it is worth queuing this on top of your patch?

--- 8< ---

From: Franck Bui-Huu <fbuihuu@gmail.com>
Subject: [PATCH] Verify the CPU type when it is hardwired

Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
---
 arch/mips/kernel/cpu-probe.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
index 06448a9..cf0b566 100644
--- a/arch/mips/kernel/cpu-probe.c
+++ b/arch/mips/kernel/cpu-probe.c
@@ -817,6 +817,14 @@ __init void cpu_probe(void)
 	default:
 		c->cputype = CPU_UNKNOWN;
 	}
+
+	/*
+	 * Platform code can force the CPU type to optimize code
+	 * generation.  In that case make sure the forced type matches
+	 * the probed one, otherwise some nasty bugs could be triggered.
+	 */
+	BUG_ON(current_cpu_type() != c->cputype);
+
 	if (c->options & MIPS_CPU_FPU) {
 		c->fpu_id = cpu_get_fpu_id();
 
-- 
1.5.3.3

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-08 14:11                     ` Franck Bui-Huu
@ 2007-10-08 14:41                       ` Ralf Baechle
  0 siblings, 0 replies; 93+ messages in thread
From: Ralf Baechle @ 2007-10-08 14:41 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Thiemo Seufer, Maciej W. Rozycki, linux-mips

On Mon, Oct 08, 2007 at 04:11:51PM +0200, Franck Bui-Huu wrote:

> Ralf Baechle wrote:
> > 6828 bytes isn't totally amazing, but since the optimization is reasonably
> > clean I'm going to queue this one also.
> > 
> 
> Yes, and maybe it is worth queuing this on top of your patch?

Well, if they're lucky enough they make it to the BUG_ON().  But for many
of the misconfiguration scenarios the symptoms would be more subtle.

Queued for 2.6.24.  Thanks,

  Ralf

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-05 12:19                       ` Maciej W. Rozycki
@ 2007-10-08 14:48                         ` Franck Bui-Huu
  2007-10-08 15:24                           ` Ralf Baechle
  2007-10-08 15:39                           ` Maciej W. Rozycki
  0 siblings, 2 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-08 14:48 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Ralf Baechle, Thiemo Seufer, linux-mips

Maciej W. Rozycki wrote:
> The exact CPU type is not known at the moment.  For example CPU_R4X00 and 
> CPU_MIPS32_R1 cover whole ranges that have subtle differences.  It may be 
> possible to provide all the variations as a selection to the user, but it 
> may be unfeasible -- I don't know.  Compare what we have in 
> arch/mips/Kconfig with <asm/cpu.h>.
> 

OK, I see.

Well, having all CPU variations in Kconfig should be technically
possible. The user needs to know exactly which CPU the kernel runs on,
which doesn't sound impossible, and we could add some sanity checks to
ensure that the configuration hasn't been messed up.

BTW, we could pass more cpu compiler options for optimization this
way. For example, when using a '4ksd' cpu, we currently can't pass
'-march=4ksd' to gcc since the cpu type used for it is 'mips32r2'. And
I guess it's true for all cpu types which cover a range of slightly
different processors (r4x00 comes to mind).

OTOH, I don't know if it can work on SMP: if the system needs 2
different implementations of the handler (I don't know if it can
happen though), we must be able to select 2 different cpu types in
Kconfig...

Do you see any other points that we should consider before trying to
use static handlers ? Some other cpu features influencing the tlb
handler generations and that can be found only at runtime ?

thanks,
		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-05  9:09                       ` Geert Uytterhoeven
@ 2007-10-08 15:02                         ` Franck Bui-Huu
  2007-10-08 15:21                           ` Geert Uytterhoeven
  0 siblings, 1 reply; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-08 15:02 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Maciej W. Rozycki, Ralf Baechle, Thiemo Seufer, linux-mips

Geert Uytterhoeven wrote:
> For specialized systems, you can always introduce the option to generate
> the TLB handler at compile time:

What do you mean by "specialized system" ?

If for some platforms we could generate the TLB handlers at compile
time, we could do it for all platforms, especially if the handler only
depends on the cpu type, no ?

>   - Enhance tlbex.c to be able to compile it for the host, and generate
>     a fixed TLB handler, based on CONFIG_* options, if
>     CONFIG_STATIC_TLB_HANDLER (buried deep in depends on EMBEDDED &&
>     ADVANCED && I_KNOW_WHAT_I_AM_DOING) is set.

It may mean putting a lot of hacks in tlbex.c, making it just a PITA to
enhance and to maintain. IMHO, just having a static TLB handler
generator is simpler, especially if we don't need to patch the handler
later. But we need to be sure that nothing can be discovered only at
runtime, for current and future supported CPUs...

thanks,
		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-08 15:02                         ` Franck Bui-Huu
@ 2007-10-08 15:21                           ` Geert Uytterhoeven
  2007-10-08 15:26                             ` Ralf Baechle
  2007-10-09 20:20                             ` Franck Bui-Huu
  0 siblings, 2 replies; 93+ messages in thread
From: Geert Uytterhoeven @ 2007-10-08 15:21 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Maciej W. Rozycki, Ralf Baechle, Thiemo Seufer, linux-mips

On Mon, 8 Oct 2007, Franck Bui-Huu wrote:
> Geert Uytterhoeven wrote:
> > For specialized systems, you can always introduce the option to generate
> > the TLB handler at compile time:
> 
> What do you mean by "specialized system" ?

Embedded.

> If for some platforms we could generate the TLB handlers at compile
> time, we could do it for all platforms, especially if the handler only
> depends on the cpu type, no ?

Can't you currently compile a kernel that runs on e.g. all O2s,
irrespective of the actual CPU type?

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-08 14:48                         ` Franck Bui-Huu
@ 2007-10-08 15:24                           ` Ralf Baechle
  2007-10-08 15:39                           ` Maciej W. Rozycki
  1 sibling, 0 replies; 93+ messages in thread
From: Ralf Baechle @ 2007-10-08 15:24 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Maciej W. Rozycki, Thiemo Seufer, linux-mips

On Mon, Oct 08, 2007 at 04:48:41PM +0200, Franck Bui-Huu wrote:

> Maciej W. Rozycki wrote:
> > The exact CPU type is not known at the moment.  For example CPU_R4X00 and 
> > CPU_MIPS32_R1 cover whole ranges that have subtle differences.  It may be 
> > possible to provide all the variations as a selection to the user, but it 
> > may be unfeasible -- I don't know.  Compare what we have in 
> > arch/mips/Kconfig with <asm/cpu.h>.
> > 
> 
> OK, I see.
> 
> Well, having all CPU variations in Kconfig should be technically
> possible. The user needs to know exactly which CPU the kernel runs on,
> which doesn't sound impossible, and we could add some sanity checks to
> ensure that the configuration hasn't been messed up.

I don't consider this much of a problem.  The machines which have either
one or more processors of the R4000 family or a mix of R10000 family
processors simply shouldn't hardwire the CPU types.  The R4000 machines
can afford the few bytes of kernel executable and the R10000 machines
often come with ridiculous amounts of memory anyway.

> BTW, we could pass more cpu compiler options for optimization this
> way. For example, when using a '4ksd' cpu, we currently can't pass
> '-march=4ksd' to gcc since the cpu type used for it is 'mips32r2'. And
> I guess it's true for all cpu types which cover a range of slightly
> different processors (r4x00 comes to mind).
> 
> OTOH, I don't know if it can work on SMP: if the system needs 2
> different implementations of the handler (I don't know if it can
> happen though), we must be able to select 2 different cpu types in
> Kconfig...

Currently the only multiprocessor systems which allow mixing of different
processors are the SGI machines and there we have the restriction to
at least the same family of processors, see above.  One which I sooner
or later expect to see is CMP systems with different clock rates per
processor.

> Do you see any other points that we should consider before trying to
> use static handlers ? Some other cpu features influencing the tlb
> handler generations and that can be found only at runtime ?

  Ralf

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-08 15:21                           ` Geert Uytterhoeven
@ 2007-10-08 15:26                             ` Ralf Baechle
  2007-10-09 20:20                             ` Franck Bui-Huu
  1 sibling, 0 replies; 93+ messages in thread
From: Ralf Baechle @ 2007-10-08 15:26 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Franck Bui-Huu, Maciej W. Rozycki, Thiemo Seufer, linux-mips

On Mon, Oct 08, 2007 at 05:21:56PM +0200, Geert Uytterhoeven wrote:

> > If for some platforms we could generate the TLB handlers at compile
> > time, we could do it for all platforms, especially if the handler only
> > depends on the cpu type, no ?
> 
> Can't you currently compile a kernel that runs on e.g. all O2s,
> irrespective of the actual CPU type?

Sort of.  There are O2s with R5000, RM523x, RM7000, R10000 and R12000
processors.  Supporting all from a single kernel would be trivial if
the R1x000 processors were not such bitches in non-coherent systems,
so the latter are still unsupported, not even in special kernel configs.

  Ralf

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-08 14:48                         ` Franck Bui-Huu
  2007-10-08 15:24                           ` Ralf Baechle
@ 2007-10-08 15:39                           ` Maciej W. Rozycki
  2007-10-09 20:17                             ` Franck Bui-Huu
  2007-10-10  8:53                             ` Ralf Baechle
  1 sibling, 2 replies; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-08 15:39 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Ralf Baechle, Thiemo Seufer, linux-mips

On Mon, 8 Oct 2007, Franck Bui-Huu wrote:

> Well, having all CPU variations in Kconfig should be technically
> possible. The user needs to know exactly which CPU the kernel runs on,
> which doesn't sound impossible, and we could add some sanity checks to
> ensure that the configuration hasn't been messed up.

 As long as the user is indeed capable of knowing what the exact CPU type 
is.  I have been told replacing R4X00 with a choice like R4000, R4400, 
R4600, R4700 may already be too much of a hassle.

 Frankly I am not entirely confident much choice beyond the ISA level is 
actually a good idea.  We do have it, because lots of bits depend on 
preprocessor conditionals even though they should not necessarily do so.  There 
are probably some historical reasons too.  But essentially we have about 
eight ISA variations (I - IV and four MIPS Architecture ISAs) and about 
four privileged resource architecture variations (R2000, R6000, R4000, 
R8000); not all combinations making sense and some of the choices actually 
not supported at all.

 CPU variations matter performance-wise, but the use of "-mtune=" is 
irrelevant in this context.

> BTW, we could pass more cpu compiler options for optimization this
> way. For example, when using a '4ksd' cpu, we currently can't pass
> '-march=4ksd' to gcc since the cpu type used for it is 'mips32r2'. And
> I guess it's true for all cpu types which cover a range of slightly
> different processors (r4x00 comes to mind).

 What would be the gain for the kernel from using "-march=4ksd" rather 
than "-march=mips32r2"?

> OTOH, I don't know if it can work on SMP: if the system needs 2
> different implementations of the handler (I don't know if it can
> happen though), we must be able to select 2 different cpu types in
> Kconfig...

 I do not think we happen to handle this scenario -- the more interesting 
configurations that could benefit do not support the cp0.ebase register 
making per-CPU handlers quite a challenge (i.e. the cost would exceed the 
benefit).

> Do you see any other points that we should consider before trying to
> use static handlers ? Some other cpu features influencing the tlb
> handler generations and that can be found only at runtime ?

 What if you want to run a single kernel image regardless of the CPU 
installed in the system?  Rebuilding the kernel (or having to keep a large 
collection of binaries) just because you want to swap the CPU does not 
seem like a terribly attractive idea.  Some systems come with their CPU(s) 
on a daughtercard (each), you know...

  Maciej

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-04 17:34                           ` Maciej W. Rozycki
@ 2007-10-08 15:46                             ` Maciej W. Rozycki
  2007-10-08 16:41                               ` Ralf Baechle
  0 siblings, 1 reply; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-08 15:46 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Franck Bui-Huu, Thiemo Seufer, linux-mips

On Thu, 4 Oct 2007, Maciej W. Rozycki wrote:

> > than any other system.  Or for your personal toy project, all DECs wouldn't
> > be too hard either, or?
> 
>  The DECs should be relatively easy if we finally managed to get rid of 
> all the 64-bit-isms in the 32-bit kernel even if built for MIPS III or 
> above.  Which, given the recent commitment to 32-bit cores is what I would 
> actually expect.

 On second thought though -- I am afraid <asm/stackframe.h> is still 
the big showstopper.  Or actually the design around it.  That does not 
mean it is undoable, but I shall defer it for now.

  Maciej

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-08 15:46                             ` Maciej W. Rozycki
@ 2007-10-08 16:41                               ` Ralf Baechle
  2007-10-08 16:45                                 ` Maciej W. Rozycki
  0 siblings, 1 reply; 93+ messages in thread
From: Ralf Baechle @ 2007-10-08 16:41 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Franck Bui-Huu, Thiemo Seufer, linux-mips

On Mon, Oct 08, 2007 at 04:46:17PM +0100, Maciej W. Rozycki wrote:

> >  The DECs should be relatively easy if we finally managed to get rid of 
> > all the 64-bit-isms in the 32-bit kernel even if built for MIPS III or 
> > above.  Which, given the recent commitment to 32-bit cores is what I would 
> > actually expect.
> 
>  On second thought though -- I am afraid <asm/stackframe.h> is still 
> the big showstopper.  Or actually the design around it.  That does not 
> mean it is undoable, but I shall defer it for now.

There will be a few more issues so I guess we best tackle this step by
step.

  Ralf

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-08 16:41                               ` Ralf Baechle
@ 2007-10-08 16:45                                 ` Maciej W. Rozycki
  2007-10-08 16:53                                   ` Ralf Baechle
  0 siblings, 1 reply; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-08 16:45 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Franck Bui-Huu, Thiemo Seufer, linux-mips

On Mon, 8 Oct 2007, Ralf Baechle wrote:

> >  On second thought though -- I am afraid <asm/stackframe.h> is still 
> > the big showstopper.  Or actually the design around it.  That does not 
> > mean it is undoable, but I shall defer it for now.
> 
> There will be a few more issues so I guess we best tackle this step by
> step.

 OK, you first!

  Maciej

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-08 16:45                                 ` Maciej W. Rozycki
@ 2007-10-08 16:53                                   ` Ralf Baechle
  0 siblings, 0 replies; 93+ messages in thread
From: Ralf Baechle @ 2007-10-08 16:53 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Franck Bui-Huu, Thiemo Seufer, linux-mips

On Mon, Oct 08, 2007 at 05:45:44PM +0100, Maciej W. Rozycki wrote:

> > >  On second thought though -- I am afraid <asm/stackframe.h> is still 
> > > the big showstopper.  Or actually the design around it.  That does not 
> > > mean it is undoable, but I shall defer it for now.
> > 
> > There will be a few more issues so I guess we best tackle this step by
> > step.
> 
>  OK, you first!

You got the R3000 :-)

  Ralf

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-08 15:39                           ` Maciej W. Rozycki
@ 2007-10-09 20:17                             ` Franck Bui-Huu
  2007-10-10 11:58                               ` Maciej W. Rozycki
  2007-10-10  8:53                             ` Ralf Baechle
  1 sibling, 1 reply; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-09 20:17 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Ralf Baechle, Thiemo Seufer, linux-mips

Maciej W. Rozycki wrote:
>  What would be the gain for the kernel from using "-march=4ksd" rather 
> than "-march=mips32r2"?
> 

It actually results in a kernel image ~30 kbytes smaller in the former
case. It was discussed some time ago on this list. I'm sorry, but
I don't know why...

> 
>  What if you want to run a single kernel image regardless of the CPU 
> installed in the system?  Rebuilding the kernel (or having to keep a large 
> collection of binaries) just because you want to swap the CPU does not 
> seem like a terribly attractive idea.  Some systems come with their CPU(s) 
> on a daughtercard (each), you know...
> 

OK, I wasn't aware of this. You could have started with this point ;)

So now I think the right direction is to stick with tlbex.c and
make it smaller like Ralf did.

Thanks,
		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-08 15:21                           ` Geert Uytterhoeven
  2007-10-08 15:26                             ` Ralf Baechle
@ 2007-10-09 20:20                             ` Franck Bui-Huu
  1 sibling, 0 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-09 20:20 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Maciej W. Rozycki, Ralf Baechle, Thiemo Seufer, linux-mips

Geert Uytterhoeven wrote:
> Can't you currently compile a kernel that runs on e.g. all O2s,
> irrespective of the actual CPU type?
> 

It seems so; I've just been taught about it, actually. So I think
we just have to stick with tlbex.c and perhaps make it smaller...

Thanks,
		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-05 11:51                   ` Ralf Baechle
  2007-10-08 14:11                     ` Franck Bui-Huu
@ 2007-10-09 20:33                     ` Franck Bui-Huu
  2007-10-09 20:34                       ` [PATCH 1/6] tlbex.c: Cleanup __init usages Franck Bui-Huu
                                         ` (5 more replies)
  1 sibling, 6 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-09 20:33 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Thiemo Seufer, Maciej W. Rozycki, linux-mips

Ralf Baechle wrote:
> So I did a few experiments.  This is the size of tlbex for a malta_defconfig

I did too, and the result is the patchset I'm going to send.

Basically it removes all arrays from the init.data section and makes
them automatic variables. That is pretty extreme, so if the stack
pressure turns out to be too high, we could move some of them back.
This is done by patches 2, 3 and 4.

   text    data     bss     dec     hex filename
   9840    3904    1568   15312    3bd0 arch/mips/mm/tlbex.o~before
   9776     576    1568   11920    2e90 arch/mips/mm/tlbex.o~after
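
As a cross-check of the data column, the sketch below adds up the four
arrays that patches 2 to 4 move to the stack. It models a 32-bit
target by using a uint32_t where the kernel has a pointer; the field
lists of the two structs are an assumption chosen to match the sizes
reported in this thread, and initdata_bytes_removed() is a made-up
helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit model: a 4-byte integer stands in for "u32 *addr". */
struct label32 { uint32_t addr; uint32_t lab; };		/*  8 bytes */
struct reloc32 { uint32_t addr; uint32_t type; uint32_t lab; };	/* 12 bytes */

/* Total size of the arrays taken out of init.data by patches 2, 3 and 4. */
size_t initdata_bytes_removed(void)
{
	size_t labels        = 128 * sizeof(struct label32);	/* patch 2 */
	size_t relocs        = 128 * sizeof(struct reloc32);	/* patch 2 */
	size_t tlb_handler   = 128 * sizeof(uint32_t);		/* patch 3 */
	size_t final_handler =  64 * sizeof(uint32_t);		/* patch 4 */

	/* Patch 2 alone accounts for the 2560 bytes in its own table. */
	assert(labels + relocs == 2560);

	return labels + relocs + tlb_handler + final_handler;
}
```

Under these assumptions the sum is 3328 bytes, which is the difference
between the 3904 and 576 bytes of data shown above.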

While I was at it, I did some trivial cleanups with patches 1, 5 and 6.

Thanks,
		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [PATCH 1/6] tlbex.c: Cleanup __init usages.
  2007-10-09 20:33                     ` Franck Bui-Huu
@ 2007-10-09 20:34                       ` Franck Bui-Huu
  2007-10-11 16:16                         ` Ralf Baechle
  2007-10-09 20:35                         ` Franck Bui-Huu
                                         ` (4 subsequent siblings)
  5 siblings, 1 reply; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-09 20:34 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Ralf Baechle, Thiemo Seufer, Maciej W. Rozycki, linux-mips


Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
---
 arch/mips/mm/tlbex.c |   96 +++++++++++++++++++++++++-------------------------
 1 files changed, 48 insertions(+), 48 deletions(-)

diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c
index a61246d..01b0961 100644
--- a/arch/mips/mm/tlbex.c
+++ b/arch/mips/mm/tlbex.c
@@ -66,7 +66,7 @@ static inline int __maybe_unused r10000_llsc_war(void)
  * why; it's not an issue caused by the core RTL.
  *
  */
-static __init int __attribute__((unused)) m4kc_tlbp_war(void)
+static int __init m4kc_tlbp_war(void)
 {
 	return (current_cpu_data.processor_id & 0xffff00) ==
 	       (PRID_COMP_MIPS | PRID_IMP_4KC);
@@ -140,7 +140,7 @@ struct insn {
 	 | (e) << RE_SH						\
 	 | (f) << FUNC_SH)
 
-static __initdata struct insn insn_table[] = {
+static struct insn insn_table[] __initdata = {
 	{ insn_addiu, M(addiu_op, 0, 0, 0, 0, 0), RS | RT | SIMM },
 	{ insn_addu, M(spec_op, 0, 0, 0, 0, addu_op), RS | RT | RD },
 	{ insn_and, M(spec_op, 0, 0, 0, 0, and_op), RS | RT | RD },
@@ -193,7 +193,7 @@ static __initdata struct insn insn_table[] = {
 
 #undef M
 
-static __init u32 build_rs(u32 arg)
+static u32 __init build_rs(u32 arg)
 {
 	if (arg & ~RS_MASK)
 		printk(KERN_WARNING "TLB synthesizer field overflow\n");
@@ -201,7 +201,7 @@ static __init u32 build_rs(u32 arg)
 	return (arg & RS_MASK) << RS_SH;
 }
 
-static __init u32 build_rt(u32 arg)
+static u32 __init build_rt(u32 arg)
 {
 	if (arg & ~RT_MASK)
 		printk(KERN_WARNING "TLB synthesizer field overflow\n");
@@ -209,7 +209,7 @@ static __init u32 build_rt(u32 arg)
 	return (arg & RT_MASK) << RT_SH;
 }
 
-static __init u32 build_rd(u32 arg)
+static u32 __init build_rd(u32 arg)
 {
 	if (arg & ~RD_MASK)
 		printk(KERN_WARNING "TLB synthesizer field overflow\n");
@@ -217,7 +217,7 @@ static __init u32 build_rd(u32 arg)
 	return (arg & RD_MASK) << RD_SH;
 }
 
-static __init u32 build_re(u32 arg)
+static u32 __init build_re(u32 arg)
 {
 	if (arg & ~RE_MASK)
 		printk(KERN_WARNING "TLB synthesizer field overflow\n");
@@ -225,7 +225,7 @@ static __init u32 build_re(u32 arg)
 	return (arg & RE_MASK) << RE_SH;
 }
 
-static __init u32 build_simm(s32 arg)
+static u32 __init build_simm(s32 arg)
 {
 	if (arg > 0x7fff || arg < -0x8000)
 		printk(KERN_WARNING "TLB synthesizer field overflow\n");
@@ -233,7 +233,7 @@ static __init u32 build_simm(s32 arg)
 	return arg & 0xffff;
 }
 
-static __init u32 build_uimm(u32 arg)
+static u32 __init build_uimm(u32 arg)
 {
 	if (arg & ~IMM_MASK)
 		printk(KERN_WARNING "TLB synthesizer field overflow\n");
@@ -241,7 +241,7 @@ static __init u32 build_uimm(u32 arg)
 	return arg & IMM_MASK;
 }
 
-static __init u32 build_bimm(s32 arg)
+static u32 __init build_bimm(s32 arg)
 {
 	if (arg > 0x1ffff || arg < -0x20000)
 		printk(KERN_WARNING "TLB synthesizer field overflow\n");
@@ -252,7 +252,7 @@ static __init u32 build_bimm(s32 arg)
 	return ((arg < 0) ? (1 << 15) : 0) | ((arg >> 2) & 0x7fff);
 }
 
-static __init u32 build_jimm(u32 arg)
+static u32 __init build_jimm(u32 arg)
 {
 	if (arg & ~((JIMM_MASK) << 2))
 		printk(KERN_WARNING "TLB synthesizer field overflow\n");
@@ -260,7 +260,7 @@ static __init u32 build_jimm(u32 arg)
 	return (arg >> 2) & JIMM_MASK;
 }
 
-static __init u32 build_func(u32 arg)
+static u32 __init build_func(u32 arg)
 {
 	if (arg & ~FUNC_MASK)
 		printk(KERN_WARNING "TLB synthesizer field overflow\n");
@@ -268,7 +268,7 @@ static __init u32 build_func(u32 arg)
 	return arg & FUNC_MASK;
 }
 
-static __init u32 build_set(u32 arg)
+static u32 __init build_set(u32 arg)
 {
 	if (arg & ~SET_MASK)
 		printk(KERN_WARNING "TLB synthesizer field overflow\n");
@@ -315,69 +315,69 @@ static void __init build_insn(u32 **buf, enum opcode opc, ...)
 }
 
 #define I_u1u2u3(op)						\
-	static inline void __init i##op(u32 **buf, unsigned int a,	\
+	static inline void i##op(u32 **buf, unsigned int a,	\
 	 	unsigned int b, unsigned int c)			\
 	{							\
 		build_insn(buf, insn##op, a, b, c);		\
 	}
 
 #define I_u2u1u3(op)						\
-	static inline void __init i##op(u32 **buf, unsigned int a,	\
+	static inline void i##op(u32 **buf, unsigned int a,	\
 	 	unsigned int b, unsigned int c)			\
 	{							\
 		build_insn(buf, insn##op, b, a, c);		\
 	}
 
 #define I_u3u1u2(op)						\
-	static inline void __init i##op(u32 **buf, unsigned int a,	\
+	static inline void i##op(u32 **buf, unsigned int a,	\
 	 	unsigned int b, unsigned int c)			\
 	{							\
 		build_insn(buf, insn##op, b, c, a);		\
 	}
 
 #define I_u1u2s3(op)						\
-	static inline void __init i##op(u32 **buf, unsigned int a,	\
+	static inline void i##op(u32 **buf, unsigned int a,	\
 	 	unsigned int b, signed int c)			\
 	{							\
 		build_insn(buf, insn##op, a, b, c);		\
 	}
 
 #define I_u2s3u1(op)						\
-	static inline void __init i##op(u32 **buf, unsigned int a,	\
+	static inline void i##op(u32 **buf, unsigned int a,	\
 	 	signed int b, unsigned int c)			\
 	{							\
 		build_insn(buf, insn##op, c, a, b);		\
 	}
 
 #define I_u2u1s3(op)						\
-	static inline void __init i##op(u32 **buf, unsigned int a,	\
+	static inline void i##op(u32 **buf, unsigned int a,	\
 	 	unsigned int b, signed int c)			\
 	{							\
 		build_insn(buf, insn##op, b, a, c);		\
 	}
 
 #define I_u1u2(op)						\
-	static inline void __init i##op(u32 **buf, unsigned int a,	\
+	static inline void i##op(u32 **buf, unsigned int a,	\
 	 	unsigned int b)					\
 	{							\
 		build_insn(buf, insn##op, a, b);		\
 	}
 
 #define I_u1s2(op)						\
-	static inline void __init i##op(u32 **buf, unsigned int a,	\
+	static inline void i##op(u32 **buf, unsigned int a,	\
 	 	signed int b)					\
 	{							\
 		build_insn(buf, insn##op, a, b);		\
 	}
 
 #define I_u1(op)						\
-	static inline void __init i##op(u32 **buf, unsigned int a)	\
+	static inline void i##op(u32 **buf, unsigned int a)	\
 	{							\
 		build_insn(buf, insn##op, a);			\
 	}
 
 #define I_0(op)							\
-	static inline void __init i##op(u32 **buf)		\
+	static inline void i##op(u32 **buf)		\
 	{							\
 		build_insn(buf, insn##op);			\
 	}
@@ -457,7 +457,7 @@ struct label {
 	enum label_id lab;
 };
 
-static __init void build_label(struct label **lab, u32 *addr,
+static void __init build_label(struct label **lab, u32 *addr,
 			       enum label_id l)
 {
 	(*lab)->addr = addr;
@@ -526,34 +526,34 @@ L_LA(_r3000_write_probe_fail)
 #define i_ehb(buf) i_sll(buf, 0, 0, 3)
 
 #ifdef CONFIG_64BIT
-static __init int __maybe_unused in_compat_space_p(long addr)
+static int __init __maybe_unused in_compat_space_p(long addr)
 {
 	/* Is this address in 32bit compat space? */
 	return (((addr) & 0xffffffff00000000L) == 0xffffffff00000000L);
 }
 
-static __init int __maybe_unused rel_highest(long val)
+static int __init __maybe_unused rel_highest(long val)
 {
 	return ((((val + 0x800080008000L) >> 48) & 0xffff) ^ 0x8000) - 0x8000;
 }
 
-static __init int __maybe_unused rel_higher(long val)
+static int __init __maybe_unused rel_higher(long val)
 {
 	return ((((val + 0x80008000L) >> 32) & 0xffff) ^ 0x8000) - 0x8000;
 }
 #endif
 
-static __init int rel_hi(long val)
+static int __init rel_hi(long val)
 {
 	return ((((val + 0x8000L) >> 16) & 0xffff) ^ 0x8000) - 0x8000;
 }
 
-static __init int rel_lo(long val)
+static int __init rel_lo(long val)
 {
 	return ((val & 0xffff) ^ 0x8000) - 0x8000;
 }
 
-static __init void i_LA_mostly(u32 **buf, unsigned int rs, long addr)
+static void __init i_LA_mostly(u32 **buf, unsigned int rs, long addr)
 {
 #ifdef CONFIG_64BIT
 	if (!in_compat_space_p(addr)) {
@@ -571,7 +571,7 @@ static __init void i_LA_mostly(u32 **buf, unsigned int rs, long addr)
 		i_lui(buf, rs, rel_hi(addr));
 }
 
-static __init void __maybe_unused i_LA(u32 **buf, unsigned int rs,
+static void __init __maybe_unused i_LA(u32 **buf, unsigned int rs,
 					     long addr)
 {
 	i_LA_mostly(buf, rs, addr);
@@ -589,7 +589,7 @@ struct reloc {
 	enum label_id lab;
 };
 
-static __init void r_mips_pc16(struct reloc **rel, u32 *addr,
+static void __init r_mips_pc16(struct reloc **rel, u32 *addr,
 			       enum label_id l)
 {
 	(*rel)->addr = addr;
@@ -614,7 +614,7 @@ static inline void __resolve_relocs(struct reloc *rel, struct label *lab)
 	}
 }
 
-static __init void resolve_relocs(struct reloc *rel, struct label *lab)
+static void __init resolve_relocs(struct reloc *rel, struct label *lab)
 {
 	struct label *l;
 
@@ -624,7 +624,7 @@ static __init void resolve_relocs(struct reloc *rel, struct label *lab)
 				__resolve_relocs(rel, l);
 }
 
-static __init void move_relocs(struct reloc *rel, u32 *first, u32 *end,
+static void __init move_relocs(struct reloc *rel, u32 *first, u32 *end,
 			       long off)
 {
 	for (; rel->lab != label_invalid; rel++)
@@ -632,7 +632,7 @@ static __init void move_relocs(struct reloc *rel, u32 *first, u32 *end,
 			rel->addr += off;
 }
 
-static __init void move_labels(struct label *lab, u32 *first, u32 *end,
+static void __init move_labels(struct label *lab, u32 *first, u32 *end,
 			       long off)
 {
 	for (; lab->lab != label_invalid; lab++)
@@ -640,7 +640,7 @@ static __init void move_labels(struct label *lab, u32 *first, u32 *end,
 			lab->addr += off;
 }
 
-static __init void copy_handler(struct reloc *rel, struct label *lab,
+static void __init copy_handler(struct reloc *rel, struct label *lab,
 				u32 *first, u32 *end, u32 *target)
 {
 	long off = (long)(target - first);
@@ -651,7 +651,7 @@ static __init void copy_handler(struct reloc *rel, struct label *lab,
 	move_labels(lab, first, end, off);
 }
 
-static __init int __maybe_unused insn_has_bdelay(struct reloc *rel,
+static int __init __maybe_unused insn_has_bdelay(struct reloc *rel,
 						       u32 *addr)
 {
 	for (; rel->lab != label_invalid; rel++) {
@@ -743,11 +743,11 @@ il_bgez(u32 **p, struct reloc **r, unsigned int reg, enum label_id l)
  * We deliberately chose a buffer size of 128, so we won't scribble
  * over anything important on overflow before we panic.
  */
-static __initdata u32 tlb_handler[128];
+static u32 tlb_handler[128] __initdata;
 
 /* simply assume worst case size for labels and relocs */
-static __initdata struct label labels[128];
-static __initdata struct reloc relocs[128];
+static struct label labels[128] __initdata;
+static struct reloc relocs[128] __initdata;
 
 /*
  * The R3000 TLB handler is simple.
@@ -801,7 +801,7 @@ static void __init build_r3000_tlb_refill_handler(void)
  * other one.To keep things simple, we first assume linear space,
  * then we relocate it to the final handler layout as needed.
  */
-static __initdata u32 final_handler[64];
+static u32 final_handler[64] __initdata;
 
 /*
  * Hazards
@@ -825,7 +825,7 @@ static __initdata u32 final_handler[64];
  *
  * As if we MIPS hackers wouldn't know how to nop pipelines happy ...
  */
-static __init void __maybe_unused build_tlb_probe_entry(u32 **p)
+static void __init __maybe_unused build_tlb_probe_entry(u32 **p)
 {
 	switch (current_cpu_type()) {
 	/* Found by experiment: R4600 v2.0 needs this, too.  */
@@ -849,7 +849,7 @@ static __init void __maybe_unused build_tlb_probe_entry(u32 **p)
  */
 enum tlb_write_entry { tlb_random, tlb_indexed };
 
-static __init void build_tlb_write_entry(u32 **p, struct label **l,
+static void __init build_tlb_write_entry(u32 **p, struct label **l,
 					 struct reloc **r,
 					 enum tlb_write_entry wmode)
 {
@@ -993,7 +993,7 @@ static __init void build_tlb_write_entry(u32 **p, struct label **l,
  * TMP and PTR are scratch.
  * TMP will be clobbered, PTR will hold the pmd entry.
  */
-static __init void
+static void __init
 build_get_pmde64(u32 **p, struct label **l, struct reloc **r,
 		 unsigned int tmp, unsigned int ptr)
 {
@@ -1054,7 +1054,7 @@ build_get_pmde64(u32 **p, struct label **l, struct reloc **r,
  * BVADDR is the faulting address, PTR is scratch.
  * PTR will hold the pgd for vmalloc.
  */
-static __init void
+static void __init
 build_get_pgd_vmalloc64(u32 **p, struct label **l, struct reloc **r,
 			unsigned int bvaddr, unsigned int ptr)
 {
@@ -1118,7 +1118,7 @@ build_get_pgd_vmalloc64(u32 **p, struct label **l, struct reloc **r,
  * TMP and PTR are scratch.
  * TMP will be clobbered, PTR will hold the pgd entry.
  */
-static __init void __maybe_unused
+static void __init __maybe_unused
 build_get_pgde32(u32 **p, unsigned int tmp, unsigned int ptr)
 {
 	long pgdc = (long)pgd_current;
@@ -1153,7 +1153,7 @@ build_get_pgde32(u32 **p, unsigned int tmp, unsigned int ptr)
 
 #endif /* !CONFIG_64BIT */
 
-static __init void build_adjust_context(u32 **p, unsigned int ctx)
+static void __init build_adjust_context(u32 **p, unsigned int ctx)
 {
 	unsigned int shift = 4 - (PTE_T_LOG2 + 1) + PAGE_SHIFT - 12;
 	unsigned int mask = (PTRS_PER_PTE / 2 - 1) << (PTE_T_LOG2 + 1);
@@ -1179,7 +1179,7 @@ static __init void build_adjust_context(u32 **p, unsigned int ctx)
 	i_andi(p, ctx, ctx, mask);
 }
 
-static __init void build_get_ptep(u32 **p, unsigned int tmp, unsigned int ptr)
+static void __init build_get_ptep(u32 **p, unsigned int tmp, unsigned int ptr)
 {
 	/*
 	 * Bug workaround for the Nevada. It seems as if under certain
@@ -1204,7 +1204,7 @@ static __init void build_get_ptep(u32 **p, unsigned int tmp, unsigned int ptr)
 	i_ADDU(p, ptr, ptr, tmp); /* add in offset */
 }
 
-static __init void build_update_entries(u32 **p, unsigned int tmp,
+static void __init build_update_entries(u32 **p, unsigned int tmp,
 					unsigned int ptep)
 {
 	/*

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 2/6] tlbex.c: Remove relocs[] and labels[] from the init.data section
@ 2007-10-09 20:35                         ` Franck Bui-Huu
  0 siblings, 0 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-09 20:35 UTC (permalink / raw)
  Cc: Ralf Baechle, Thiemo Seufer, Maciej W. Rozycki, linux-mips

This patch reduces the kernel image size by making these 2 arrays
automatic variables.

	tlbex.o~old  =>  tlbex.o
	 text:     9840     9812      -28  0%
	 data:     3904     1344    -2560 -65%
	  bss:     1568     1568        0  0%
	total:    15312    12724    -2588 -16%

It increases the stack pressure a lot (more than 2500 bytes), but
at this stage of the boot process that shouldn't matter.

Furthermore, the TLB handler generator code doesn't have a deep
call graph and probably never will.
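
For readers unfamiliar with the idiom, a minimal stand-alone sketch of
the pattern this patch switches to: the array is an automatic
variable, a cursor pointer walks it, and the table is terminated by an
invalid label. The label ids and count_labels_demo() are hypothetical;
build_label() is modelled on the tlbex.c helper. Note that stack
memory is not zeroed the way static data is, so the explicit memset()
is what makes the sentinel valid.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical ids; the value 0 must mean "no label" for the sentinel. */
enum label_id { label_invalid, label_first, label_second };

struct label {
	uint32_t *addr;
	enum label_id lab;
};

/* Record a label at addr and advance the cursor. */
static void build_label(struct label **lab, uint32_t *addr, enum label_id l)
{
	assert(l != label_invalid);
	(*lab)->addr = addr;
	(*lab)->lab = l;
	(*lab)++;
}

/* Build two labels in an on-stack table, then count them via the sentinel. */
int count_labels_demo(void)
{
	uint32_t code[8];
	struct label labels[8], *l = labels;
	int n = 0;

	/* sizeof() still yields the full array size: labels is in scope. */
	memset(labels, 0, sizeof(labels));
	memset(code, 0, sizeof(code));

	build_label(&l, &code[1], label_first);
	build_label(&l, &code[5], label_second);

	for (l = labels; l->lab != label_invalid; l++)
		n++;

	return n;
}
```

The arrays vanish when the builder returns, which is fine here because
the relocations are resolved before the function exits.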

Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
---
 arch/mips/mm/tlbex.c |   32 ++++++++++++++------------------
 1 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c
index 01b0961..ae1bf81 100644
--- a/arch/mips/mm/tlbex.c
+++ b/arch/mips/mm/tlbex.c
@@ -745,10 +745,6 @@ il_bgez(u32 **p, struct reloc **r, unsigned int reg, enum label_id l)
  */
 static u32 tlb_handler[128] __initdata;
 
-/* simply assume worst case size for labels and relocs */
-static struct label labels[128] __initdata;
-static struct reloc relocs[128] __initdata;
-
 /*
  * The R3000 TLB handler is simple.
  */
@@ -1250,8 +1246,8 @@ static void __init build_update_entries(u32 **p, unsigned int tmp,
 static void __init build_r4000_tlb_refill_handler(void)
 {
 	u32 *p = tlb_handler;
-	struct label *l = labels;
-	struct reloc *r = relocs;
+	struct label labels[128], *l = labels;
+	struct reloc relocs[128], *r = relocs;
 	u32 *f;
 	unsigned int final_len;
 	int i;
@@ -1598,8 +1594,8 @@ build_r3000_tlbchange_handler_head(u32 **p, unsigned int pte,
 static void __init build_r3000_tlb_load_handler(void)
 {
 	u32 *p = handle_tlbl;
-	struct label *l = labels;
-	struct reloc *r = relocs;
+	struct label labels[FASTPATH_SIZE], *l = labels;
+	struct reloc relocs[FASTPATH_SIZE], *r = relocs;
 	int i;
 
 	memset(handle_tlbl, 0, sizeof(handle_tlbl));
@@ -1633,8 +1629,8 @@ static void __init build_r3000_tlb_load_handler(void)
 static void __init build_r3000_tlb_store_handler(void)
 {
 	u32 *p = handle_tlbs;
-	struct label *l = labels;
-	struct reloc *r = relocs;
+	struct label labels[FASTPATH_SIZE], *l = labels;
+	struct reloc relocs[FASTPATH_SIZE], *r = relocs;
 	int i;
 
 	memset(handle_tlbs, 0, sizeof(handle_tlbs));
@@ -1668,8 +1664,8 @@ static void __init build_r3000_tlb_store_handler(void)
 static void __init build_r3000_tlb_modify_handler(void)
 {
 	u32 *p = handle_tlbm;
-	struct label *l = labels;
-	struct reloc *r = relocs;
+	struct label labels[FASTPATH_SIZE], *l = labels;
+	struct reloc relocs[FASTPATH_SIZE], *r = relocs;
 	int i;
 
 	memset(handle_tlbm, 0, sizeof(handle_tlbm));
@@ -1748,8 +1744,8 @@ build_r4000_tlbchange_handler_tail(u32 **p, struct label **l,
 static void __init build_r4000_tlb_load_handler(void)
 {
 	u32 *p = handle_tlbl;
-	struct label *l = labels;
-	struct reloc *r = relocs;
+	struct label labels[FASTPATH_SIZE], *l = labels;
+	struct reloc relocs[FASTPATH_SIZE], *r = relocs;
 	int i;
 
 	memset(handle_tlbl, 0, sizeof(handle_tlbl));
@@ -1793,8 +1789,8 @@ static void __init build_r4000_tlb_load_handler(void)
 static void __init build_r4000_tlb_store_handler(void)
 {
 	u32 *p = handle_tlbs;
-	struct label *l = labels;
-	struct reloc *r = relocs;
+	struct label labels[FASTPATH_SIZE], *l = labels;
+	struct reloc relocs[FASTPATH_SIZE], *r = relocs;
 	int i;
 
 	memset(handle_tlbs, 0, sizeof(handle_tlbs));
@@ -1829,8 +1825,8 @@ static void __init build_r4000_tlb_store_handler(void)
 static void __init build_r4000_tlb_modify_handler(void)
 {
 	u32 *p = handle_tlbm;
-	struct label *l = labels;
-	struct reloc *r = relocs;
+	struct label labels[FASTPATH_SIZE], *l = labels;
+	struct reloc relocs[FASTPATH_SIZE], *r = relocs;
 	int i;
 
 	memset(handle_tlbm, 0, sizeof(handle_tlbm));
-- 
1.5.3.3

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 3/6] tlbex.c: remove tlb_handler[] from init.data section
@ 2007-10-09 20:36                         ` Franck Bui-Huu
  0 siblings, 0 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-09 20:36 UTC (permalink / raw)
  Cc: Ralf Baechle, Thiemo Seufer, Maciej W. Rozycki, linux-mips

This patch makes it an automatic variable instead, which further
increases the stack pressure by 512 bytes.

It results in the following size decrease:

	tlbex.o~old  =>  tlbex.o
	 text:     9812     9780      -32  0%
	 data:     1344      832     -512 -38%
	  bss:     1568     1568        0  0%
	total:    12724    12180     -544 -4%
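
The sizes involved are easy to mix up because the comments count
instructions while sizeof() and memcpy() count bytes. The sketch below
spells out the conversion, assuming 4-byte instruction words;
insns_to_bytes() and install_handler() are hypothetical helpers, not
functions from tlbex.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define R3000_SLOT_INSNS 32	/* space available, in instructions */
#define R4000_SLOT_INSNS 64

/* Convert an instruction count into the byte count memcpy() expects. */
size_t insns_to_bytes(size_t insns)
{
	return insns * sizeof(uint32_t);
}

/* Copy a handler of 'insns' instructions; returns the bytes copied. */
size_t install_handler(uint32_t *dst, const uint32_t *src, size_t insns)
{
	size_t bytes = insns_to_bytes(insns);

	assert(dst != NULL && src != NULL);
	memcpy(dst, src, bytes);
	return bytes;
}
```

With this conversion a 128-entry buffer is the 512 bytes quoted above,
and the 32- and 64-instruction slots are 0x80 and 0x100 bytes. A
memcpy() length of 32 or 64 therefore covers only 8 or 16
instructions, so if the whole slot is meant to be copied the length
has to be given in bytes.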

Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
---
 arch/mips/mm/tlbex.c |   50 +++++++++++++++++++++++++++++---------------------
 1 files changed, 29 insertions(+), 21 deletions(-)

diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c
index ae1bf81..cbcb320 100644
--- a/arch/mips/mm/tlbex.c
+++ b/arch/mips/mm/tlbex.c
@@ -735,27 +735,23 @@ il_bgez(u32 **p, struct reloc **r, unsigned int reg, enum label_id l)
 # define GET_CONTEXT(buf, reg) i_MFC0(buf, reg, C0_CONTEXT)
 #endif
 
-/* The worst case length of the handler is around 18 instructions for
- * R3000-style TLBs and up to 63 instructions for R4000-style TLBs.
- * Maximum space available is 32 instructions for R3000 and 64
- * instructions for R4000.
- *
- * We deliberately chose a buffer size of 128, so we won't scribble
- * over anything important on overflow before we panic.
- */
-static u32 tlb_handler[128] __initdata;
-
 /*
  * The R3000 TLB handler is simple.
+ *
+ * The worst case length of the handler is around 18 instructions for
+ * R3000-style TLBs and the maximum space available for it is 32
+ * instructions.
+ *
+ * We deliberately chose a buffer size of 64, so we won't scribble
+ * over anything important on overflow before we panic.
  */
 static void __init build_r3000_tlb_refill_handler(void)
 {
+	u32 tlb_handler[64], *p = tlb_handler;
 	long pgdc = (long)pgd_current;
-	u32 *p;
 	int i;
 
 	memset(tlb_handler, 0, sizeof(tlb_handler));
-	p = tlb_handler;
 
 	i_mfc0(&p, K0, C0_BADVADDR);
 	i_lui(&p, K1, rel_hi(pgdc)); /* cp0 delay */
@@ -787,17 +783,19 @@ static void __init build_r3000_tlb_refill_handler(void)
 		pr_debug("\t.word 0x%08x\n", tlb_handler[i]);
 	pr_debug("\t.set pop\n");
 
-	memcpy((void *)ebase, tlb_handler, 0x80);
+	memcpy((void *)ebase, tlb_handler, 32);
 }
 
 /*
- * The R4000 TLB handler is much more complicated. We have two
- * consecutive handler areas with 32 instructions space each.
- * Since they aren't used at the same time, we can overflow in the
- * other one.To keep things simple, we first assume linear space,
- * then we relocate it to the final handler layout as needed.
+ * The R4000 TLB handler.
+ *
+ * The worst case length of the handler is up to 63 instructions for
+ * R4000-style TLBs and the maximum space available for it is 64
+ * instructions.
+ *
+ * We deliberately chose a buffer size of 128, so we won't scribble
+ * over anything important on overflow before we panic.
  */
-static u32 final_handler[64] __initdata;
 
 /*
  * Hazards
@@ -1243,9 +1241,19 @@ static void __init build_update_entries(u32 **p, unsigned int tmp,
 #endif
 }
 
+/*
+ * The R4000 TLB handler is much more complicated. We have two
+ * consecutive handler areas with 32 instructions space each.
+ * Since they aren't used at the same time, we can overflow in the
+ * other one.To keep things simple, we first assume linear space,
+ * then we relocate it to the final handler layout as needed.
+ */
+static u32 final_handler[64] __initdata;
+
+
 static void __init build_r4000_tlb_refill_handler(void)
 {
-	u32 *p = tlb_handler;
+	u32 tlb_handler[128], *p = tlb_handler;
 	struct label labels[128], *l = labels;
 	struct reloc relocs[128], *r = relocs;
 	u32 *f;
@@ -1365,7 +1373,7 @@ static void __init build_r4000_tlb_refill_handler(void)
 		pr_debug("\t.word 0x%08x\n", f[i]);
 	pr_debug("\t.set pop\n");
 
-	memcpy((void *)ebase, final_handler, 0x100);
+	memcpy((void *)ebase, final_handler, 64);
 }
 
 /*
-- 
1.5.3.3

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 4/6] tlbex.c: remove final_handler[] from init.data section
@ 2007-10-09 20:37                         ` Franck Bui-Huu
  0 siblings, 0 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-09 20:37 UTC (permalink / raw)
  Cc: Ralf Baechle, Thiemo Seufer, Maciej W. Rozycki, linux-mips

This patch uses 256 bytes of stack and shrinks the kernel image by
the same amount.

Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
---
 arch/mips/mm/tlbex.c |    5 +----
 1 files changed, 1 insertions(+), 4 deletions(-)

diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c
index cbcb320..6991b89 100644
--- a/arch/mips/mm/tlbex.c
+++ b/arch/mips/mm/tlbex.c
@@ -1248,15 +1248,12 @@ static void __init build_update_entries(u32 **p, unsigned int tmp,
  * other one.To keep things simple, we first assume linear space,
  * then we relocate it to the final handler layout as needed.
  */
-static u32 final_handler[64] __initdata;
-
-
 static void __init build_r4000_tlb_refill_handler(void)
 {
 	u32 tlb_handler[128], *p = tlb_handler;
+	u32 final_handler[64], *f;
 	struct label labels[128], *l = labels;
 	struct reloc relocs[128], *r = relocs;
-	u32 *f;
 	unsigned int final_len;
 	int i;
 
-- 
1.5.3.3

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 5/6] tlbex.c: cleanup debug code
@ 2007-10-09 20:38                         ` Franck Bui-Huu
  0 siblings, 0 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-09 20:38 UTC (permalink / raw)
  Cc: Ralf Baechle, Thiemo Seufer, Maciej W. Rozycki, linux-mips


Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
---
 arch/mips/mm/tlbex.c |   83 +++++++++++++++----------------------------------
 1 files changed, 26 insertions(+), 57 deletions(-)

diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c
index 6991b89..e725072 100644
--- a/arch/mips/mm/tlbex.c
+++ b/arch/mips/mm/tlbex.c
@@ -714,6 +714,22 @@ il_bgez(u32 **p, struct reloc **r, unsigned int reg, enum label_id l)
 	i_bgez(p, reg, 0);
 }
 
+/*
+ * For debug purposes.
+ */
+static inline void dump_handler(const u32 *handler, int count)
+{
+	int i;
+
+	pr_debug("\t.set push\n");
+	pr_debug("\t.set noreorder\n");
+
+	for (i = 0; i < count; i++)
+		pr_debug("\t%p\t.word 0x%08x\n", &handler[i], handler[i]);
+
+	pr_debug("\t.set pop\n");
+}
+
 /* The only general purpose registers allowed in TLB handlers. */
 #define K0		26
 #define K1		27
@@ -749,7 +765,6 @@ static void __init build_r3000_tlb_refill_handler(void)
 {
 	u32 tlb_handler[64], *p = tlb_handler;
 	long pgdc = (long)pgd_current;
-	int i;
 
 	memset(tlb_handler, 0, sizeof(tlb_handler));
 
@@ -777,13 +792,9 @@ static void __init build_r3000_tlb_refill_handler(void)
 	pr_info("Synthesized TLB refill handler (%u instructions).\n",
 		(unsigned int)(p - tlb_handler));
 
-	pr_debug("\t.set push\n");
-	pr_debug("\t.set noreorder\n");
-	for (i = 0; i < (p - tlb_handler); i++)
-		pr_debug("\t.word 0x%08x\n", tlb_handler[i]);
-	pr_debug("\t.set pop\n");
-
 	memcpy((void *)ebase, tlb_handler, 32);
+
+	dump_handler((u32 *)ebase, 32);
 }
 
 /*
@@ -1255,7 +1266,6 @@ static void __init build_r4000_tlb_refill_handler(void)
 	struct label labels[128], *l = labels;
 	struct reloc relocs[128], *r = relocs;
 	unsigned int final_len;
-	int i;
 
 	memset(tlb_handler, 0, sizeof(tlb_handler));
 	memset(labels, 0, sizeof(labels));
@@ -1357,20 +1367,9 @@ static void __init build_r4000_tlb_refill_handler(void)
 	pr_info("Synthesized TLB refill handler (%u instructions).\n",
 		final_len);
 
-	f = final_handler;
-#if defined(CONFIG_64BIT) && !defined(CONFIG_CPU_LOONGSON2)
-	if (final_len > 32)
-		final_len = 64;
-	else
-		f = final_handler + 32;
-#endif /* CONFIG_64BIT */
-	pr_debug("\t.set push\n");
-	pr_debug("\t.set noreorder\n");
-	for (i = 0; i < final_len; i++)
-		pr_debug("\t.word 0x%08x\n", f[i]);
-	pr_debug("\t.set pop\n");
-
 	memcpy((void *)ebase, final_handler, 64);
+
+	dump_handler((u32 *)ebase, 64);
 }
 
 /*
@@ -1601,7 +1600,6 @@ static void __init build_r3000_tlb_load_handler(void)
 	u32 *p = handle_tlbl;
 	struct label labels[FASTPATH_SIZE], *l = labels;
 	struct reloc relocs[FASTPATH_SIZE], *r = relocs;
-	int i;
 
 	memset(handle_tlbl, 0, sizeof(handle_tlbl));
 	memset(labels, 0, sizeof(labels));
@@ -1624,11 +1622,7 @@ static void __init build_r3000_tlb_load_handler(void)
 	pr_info("Synthesized TLB load handler fastpath (%u instructions).\n",
 		(unsigned int)(p - handle_tlbl));
 
-	pr_debug("\t.set push\n");
-	pr_debug("\t.set noreorder\n");
-	for (i = 0; i < (p - handle_tlbl); i++)
-		pr_debug("\t.word 0x%08x\n", handle_tlbl[i]);
-	pr_debug("\t.set pop\n");
+	dump_handler(handle_tlbl, ARRAY_SIZE(handle_tlbl));
 }
 
 static void __init build_r3000_tlb_store_handler(void)
@@ -1636,7 +1630,6 @@ static void __init build_r3000_tlb_store_handler(void)
 	u32 *p = handle_tlbs;
 	struct label labels[FASTPATH_SIZE], *l = labels;
 	struct reloc relocs[FASTPATH_SIZE], *r = relocs;
-	int i;
 
 	memset(handle_tlbs, 0, sizeof(handle_tlbs));
 	memset(labels, 0, sizeof(labels));
@@ -1659,11 +1652,7 @@ static void __init build_r3000_tlb_store_handler(void)
 	pr_info("Synthesized TLB store handler fastpath (%u instructions).\n",
 		(unsigned int)(p - handle_tlbs));
 
-	pr_debug("\t.set push\n");
-	pr_debug("\t.set noreorder\n");
-	for (i = 0; i < (p - handle_tlbs); i++)
-		pr_debug("\t.word 0x%08x\n", handle_tlbs[i]);
-	pr_debug("\t.set pop\n");
+	dump_handler(handle_tlbs, ARRAY_SIZE(handle_tlbs));
 }
 
 static void __init build_r3000_tlb_modify_handler(void)
@@ -1671,7 +1660,6 @@ static void __init build_r3000_tlb_modify_handler(void)
 	u32 *p = handle_tlbm;
 	struct label labels[FASTPATH_SIZE], *l = labels;
 	struct reloc relocs[FASTPATH_SIZE], *r = relocs;
-	int i;
 
 	memset(handle_tlbm, 0, sizeof(handle_tlbm));
 	memset(labels, 0, sizeof(labels));
@@ -1694,11 +1682,7 @@ static void __init build_r3000_tlb_modify_handler(void)
 	pr_info("Synthesized TLB modify handler fastpath (%u instructions).\n",
 		(unsigned int)(p - handle_tlbm));
 
-	pr_debug("\t.set push\n");
-	pr_debug("\t.set noreorder\n");
-	for (i = 0; i < (p - handle_tlbm); i++)
-		pr_debug("\t.word 0x%08x\n", handle_tlbm[i]);
-	pr_debug("\t.set pop\n");
+	dump_handler(handle_tlbm, ARRAY_SIZE(handle_tlbm));
 }
 
 /*
@@ -1751,7 +1735,6 @@ static void __init build_r4000_tlb_load_handler(void)
 	u32 *p = handle_tlbl;
 	struct label labels[FASTPATH_SIZE], *l = labels;
 	struct reloc relocs[FASTPATH_SIZE], *r = relocs;
-	int i;
 
 	memset(handle_tlbl, 0, sizeof(handle_tlbl));
 	memset(labels, 0, sizeof(labels));
@@ -1784,11 +1767,7 @@ static void __init build_r4000_tlb_load_handler(void)
 	pr_info("Synthesized TLB load handler fastpath (%u instructions).\n",
 		(unsigned int)(p - handle_tlbl));
 
-	pr_debug("\t.set push\n");
-	pr_debug("\t.set noreorder\n");
-	for (i = 0; i < (p - handle_tlbl); i++)
-		pr_debug("\t.word 0x%08x\n", handle_tlbl[i]);
-	pr_debug("\t.set pop\n");
+	dump_handler(handle_tlbl, ARRAY_SIZE(handle_tlbl));
 }
 
 static void __init build_r4000_tlb_store_handler(void)
@@ -1796,7 +1775,6 @@ static void __init build_r4000_tlb_store_handler(void)
 	u32 *p = handle_tlbs;
 	struct label labels[FASTPATH_SIZE], *l = labels;
 	struct reloc relocs[FASTPATH_SIZE], *r = relocs;
-	int i;
 
 	memset(handle_tlbs, 0, sizeof(handle_tlbs));
 	memset(labels, 0, sizeof(labels));
@@ -1820,11 +1798,7 @@ static void __init build_r4000_tlb_store_handler(void)
 	pr_info("Synthesized TLB store handler fastpath (%u instructions).\n",
 		(unsigned int)(p - handle_tlbs));
 
-	pr_debug("\t.set push\n");
-	pr_debug("\t.set noreorder\n");
-	for (i = 0; i < (p - handle_tlbs); i++)
-		pr_debug("\t.word 0x%08x\n", handle_tlbs[i]);
-	pr_debug("\t.set pop\n");
+	dump_handler(handle_tlbs, ARRAY_SIZE(handle_tlbs));
 }
 
 static void __init build_r4000_tlb_modify_handler(void)
@@ -1832,7 +1806,6 @@ static void __init build_r4000_tlb_modify_handler(void)
 	u32 *p = handle_tlbm;
 	struct label labels[FASTPATH_SIZE], *l = labels;
 	struct reloc relocs[FASTPATH_SIZE], *r = relocs;
-	int i;
 
 	memset(handle_tlbm, 0, sizeof(handle_tlbm));
 	memset(labels, 0, sizeof(labels));
@@ -1857,11 +1830,7 @@ static void __init build_r4000_tlb_modify_handler(void)
 	pr_info("Synthesized TLB modify handler fastpath (%u instructions).\n",
 		(unsigned int)(p - handle_tlbm));
 
-	pr_debug("\t.set push\n");
-	pr_debug("\t.set noreorder\n");
-	for (i = 0; i < (p - handle_tlbm); i++)
-		pr_debug("\t.word 0x%08x\n", handle_tlbm[i]);
-	pr_debug("\t.set pop\n");
+	dump_handler(handle_tlbm, ARRAY_SIZE(handle_tlbm));
 }
 
 void __init build_tlb_refill_handler(void)
-- 
1.5.3.3

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 6/6] tlbex.c: cleanup include files
@ 2007-10-09 20:39                         ` Franck Bui-Huu
  0 siblings, 0 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-09 20:39 UTC (permalink / raw)
  Cc: Ralf Baechle, Thiemo Seufer, Maciej W. Rozycki, linux-mips


Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
---
 arch/mips/mm/tlbex.c |    9 ---------
 1 files changed, 0 insertions(+), 9 deletions(-)

diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c
index e725072..05dc390 100644
--- a/arch/mips/mm/tlbex.c
+++ b/arch/mips/mm/tlbex.c
@@ -19,20 +19,11 @@
  * (Condolences to Napoleon XIV)
  */
 
-#include <stdarg.h>
-
-#include <linux/mm.h>
 #include <linux/kernel.h>
-#include <linux/types.h>
-#include <linux/string.h>
-#include <linux/init.h>
 
-#include <asm/pgtable.h>
-#include <asm/cacheflush.h>
 #include <asm/mmu_context.h>
 #include <asm/inst.h>
 #include <asm/elf.h>
-#include <asm/smp.h>
 #include <asm/war.h>
 
 static inline int r45k_bvahwbug(void)
-- 
1.5.3.3

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-08 15:39                           ` Maciej W. Rozycki
  2007-10-09 20:17                             ` Franck Bui-Huu
@ 2007-10-10  8:53                             ` Ralf Baechle
  2007-10-10 12:17                               ` Maciej W. Rozycki
  1 sibling, 1 reply; 93+ messages in thread
From: Ralf Baechle @ 2007-10-10  8:53 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Franck Bui-Huu, Thiemo Seufer, linux-mips

On Mon, Oct 08, 2007 at 04:39:38PM +0100, Maciej W. Rozycki wrote:

> > Well, having all cpu variations in Kconfig should be technically
> > possible. The user needs to know what exact cpu is running on which
> > doesn't sound impossible and we could add some sanity checkings to
> > ensure he doesn't messed up its configuration.
> 
>  As long as the user is indeed capable of knowing what the exact CPU type 
> is.  I have been told replacing R4X00 with a choice like R4000, R4400, 
> R4600, R4700 may already be too much of a hassle.

Four choices is too much; after all these four marketing names are really
just 4 variants of two fairly similar processors.  Doable?  Yes.  A useful
improvement?  I doubt it; OTOH users of those old machines still count every
cycle by hand ;-)

Another problem with the enormous and continuously growing number of
processors is that few users know about all the compatibility issues
between the choices offered in Kconfig.  A lot of bug reports were caused,
for example, by users taking MIPS32 to mean 32-bit MIPS - but R3000
processors clearly disagree with that view ;-)

One of the things I'm trying to achieve is to get rid of all the uses of
CONFIG_CPU_MIPS32_R1 and similar processor symbols in code, getting to the
point where selecting one of those symbols in Kconfig only means optimizing
the kernel for the selected core without sacrificing compatibility.

(But of course the few machines that support processors with multiple ISAs
spoil that plan a little ..)

>  Frankly I am not entirely confident much choice beyond the ISA level is 
> actually a good idea.  We do have it, because lots of bits depend on 
> preprocessor conditionals even though they not necessarily should.  There 
> are probably some historical reasons too.  But essentially we have about 
> eight ISA variations (I - IV and four MIPS Architecture ISAs) and about 
> four privileged resource architecture variations (R2000, R6000, R4000, 
> R8000); not all combinations making sense and some of the choices actually 
> not supported at all.
> 
>  CPU variations matter performance-wise, but the use of "-mtune=" is 
> irrelevant in this context.
> 
> > BTW, we could pass more cpu compiler options for optimization this
> > way. For example, when using a '4ksd' cpu, we currently can't pass
> > '-march=4ksd' to gcc since the cpu type used for it is 'mips32r2'. And
> > I guess it's true for all cpu types which cover a range of slightly
> > different processors (r4x00 comes in mind).
> 
>  What would be the gain for the kernel from using "-march=4ksd" rather 
> than "-march=mips32r2"?

One looks fancier ;-)

> > OTOH, I don't know if it can work on SMP: if the system needs 2
> > different implementations of the handler (I don't know if it can
> > happen though), we must be able to select 2 different cpu types in
> > Kconfig...
> 
>  I do not think we happen to handle this scenario -- the more interesting 
> configurations that could benefit do not support the cp0.ebase register 
> making per-CPU handlers quite a challenge (i.e. the cost would exceed the 
> benefit).

It's doable but there is little point.  Ebase is an R2 feature and who
on earth would mix pre-R2 and R2 cores in a SOC now that R2 is established
for a few years?

> > Do you see any other points that we should consider before trying to
> > use static handlers ? Some other cpu features influencing the tlb
> > handler generations and that can be found only at runtime ?
> 
>  What if you want to run a single kernel image regardless of the CPU 
> installed in the system.  Rebuilding the kernel (or having to keep a large 
> collection of binaries) just because you want to swap the CPU does not 
> seem like a terribly attractive idea.  Some systems come with their CPU(s) 
> on a daughtercard (each), you know...

Or an FPGA.  I can swap CPUs on my Malta from the other side of the earth
in a few seconds by downloading another bitfile.  And it's damn useful to be
able to use the same kernel binary; it keeps another 10 min from going down
the drain just for a rebuild.

  Ralf

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-09 20:17                             ` Franck Bui-Huu
@ 2007-10-10 11:58                               ` Maciej W. Rozycki
  2007-10-10 12:08                                 ` [SPAM?] " Nigel Stephens
  2007-10-14 19:32                                 ` Franck Bui-Huu
  0 siblings, 2 replies; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-10 11:58 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Ralf Baechle, Thiemo Seufer, linux-mips

On Tue, 9 Oct 2007, Franck Bui-Huu wrote:

> >  What would be the gain for the kernel from using "-march=4ksd" rather 
> > than "-march=mips32r2"?
> > 
> 
> It actually results in a kernel image ~30kbytes smaller for the former
> case. It has been discussed sometimes ago on this list. I'm sorry but
> I don't know why...

 Perhaps the pipeline description for the 4KSd CPU is different from the 
default for the MIPS32r2 ISA.  Barring a study of GCC sources, if that 
really troubles you, you could build the same version of the kernel with 
these options:

1. "-march=mips32r2"

2. "-march=4ksd"

3. "-march=mips32r2 -mtune=4ksd"

and compare the results.  I expect the results of #2 and #3 to be the same 
and it would just back up my suggestion about keeping CPU-specific 
optimisations separate from the CPU selection.  Please also note that our 
optimisation model is for speed (-O2) rather than size (-Os), so if 
"-mtune=4ksd" yields smaller code than "-mtune=mips32r2", it just means it 
is safe for this CPU to shrink code where appropriate without losing 
performance.  One obvious place for such a choice is the use of the 
hardware multiplier vs shifts and additions where one multiplicand is a 
constant.
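
 As an illustration of the multiplier trade-off, here is a minimal sketch 
in plain C.  The function names are made up for this example, and which 
form GCC actually emits depends on the compiler version and on the cost 
tables selected by "-mtune=", so this only shows that the two forms are 
interchangeable, not what any particular build produces:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: leave the choice of instructions to the compiler. */
static inline uint32_t mul10_plain(uint32_t x)
{
	return x * 10u;
}

/*
 * The same product written as shifts and additions:
 * 10 * x = 8 * x + 2 * x = (x << 3) + (x << 1).
 */
static inline uint32_t mul10_shift_add(uint32_t x)
{
	return (x << 3) + (x << 1);
}

/* Both forms wrap modulo 2^32, so they agree for every input. */
static inline void mul10_selfcheck(uint32_t x)
{
	assert(mul10_plain(x) == mul10_shift_add(x));
}
```

 The shift-and-add form costs more instructions but avoids the multiplier, 
so which one is preferable depends on the multiplier latency of the CPU 
being tuned for.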

> >  What if you want to run a single kernel image regardless of the CPU 
> > installed in the system.  Rebuilding the kernel (or having to keep a large 
> > collection of binaries) just because you want to swap the CPU does not 
> > seem like a terribly attractive idea.  Some systems come with their CPU(s) 
> > on a daughtercard (each), you know...
> > 
> 
> ok, I wasn't aware about this. You could have started by this point ;)

 Well, daughtercards for CPUs are so common for me -- the vast majority of 
MIPS-based systems I use have them -- that I have assumed, obviously 
incorrectly, that you see a benefit from such a rewrite of the TLB 
exception handlers which is large enough to justify the inconvenience of 
limiting the kernel to a given CPU card.

> So now I think the right direction is to stick with tlbex.c and
> make it smaller like Ralf did.

 That is certainly a good idea.

  Maciej

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [SPAM?]  Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-10 11:58                               ` Maciej W. Rozycki
@ 2007-10-10 12:08                                 ` Nigel Stephens
  2007-10-11 12:01                                   ` Maciej W. Rozycki
  2007-10-14 19:32                                 ` Franck Bui-Huu
  1 sibling, 1 reply; 93+ messages in thread
From: Nigel Stephens @ 2007-10-10 12:08 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Franck Bui-Huu, Ralf Baechle, Thiemo Seufer, linux-mips



Maciej W. Rozycki wrote:
> On Tue, 9 Oct 2007, Franck Bui-Huu wrote:
>
>   
>>>  What would be the gain for the kernel from using "-march=4ksd" rather 
>>> than "-march=mips32r2"?
>>>
>>>       
>> It actually results in a kernel image ~30kbytes smaller for the former
>> case. It has been discussed sometimes ago on this list. I'm sorry but
>> I don't know why...
>>     
>
>  Perhaps the pipeline description for the 4KSd CPU is different from the 
> default for the MIPS32r2 ISA.  Barring a study of GCC sources, if that 
> really troubles you, you could build the same version of the kernel with 
> these options:
>
> 1. "-march=mips32r2"
>
> 2. "-march=4ksd"
>
> 3. "-march=mips32r2 -mtune=4ksd"
>
> and compare the results. 



>  I expect the results of #2 and #3 to be the same 
> and it would just back up my suggestion about keeping CPU-specific 
> optimisations separate from the CPU selection.  

Actually the -march=4ksd option will allow gcc to use the SmartMIPS 
lwxs (indexed load) instruction, which could save a few instructions 
here and there.
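
A rough sketch of the kind of source construct where an indexed load could 
help is below.  The helper name and the table are made up for illustration, 
and whether gcc really selects lwxs for it depends on the compiler version 
and options, so treat this as an assumption to be checked against the 
generated assembly:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: a word-sized table lookup.  Computing the address
 * takes a scale (index * 4) and an add before the load; an indexed,
 * scaled load instruction can fold those steps into the load itself.
 */
static inline uint32_t load_indexed(const uint32_t *table, size_t index)
{
	return table[index];
}

/* Small usage check with a made-up table. */
static inline void load_indexed_selfcheck(void)
{
	static const uint32_t table[4] = { 10, 20, 30, 40 };

	assert(load_indexed(table, 2) == 30);
}
```

Comparing the disassembly of such a function built with "-march=mips32r2" 
and with "-march=4ksd" would show whether the instruction is used.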


> Please also note that our 
> optimisation model is for speed (-O2) rather than size (-Os), so if 
> "-mtune=4ksd" yields smaller code than "-mtune=mips32r2", it just means it 
> is safe for this CPU to shrink code where appropriate without losing 
> performance.  One obvious place for such a choice is the use of the 
> hardware multiplier vs shifts and additions where one multiplicand is a 
> constant.
>
>   

Yes, that's also worth testing.

Nigel

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-10  8:53                             ` Ralf Baechle
@ 2007-10-10 12:17                               ` Maciej W. Rozycki
  0 siblings, 0 replies; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-10 12:17 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Franck Bui-Huu, Thiemo Seufer, linux-mips

On Wed, 10 Oct 2007, Ralf Baechle wrote:

> >  As long as the user is indeed capable of knowing what the exact CPU type 
> > is.  I have been told replacing R4X00 with a choice like R4000, R4400, 
> > R4600, R4700 may already be too much of a hassle.
> 
> Four choices is too much; after all these four marketing names are really
> just 4 variants of two fairly similar processors.  Doable?  Yes.  A useful
> improvement?  I doubt it; OTOH users of those old machines still count every
> cycle by hand ;-)

 Except for the note on cycle counting ;-) I do agree these days, and 
about the only place that cares about the subtleties of the R4k models is 
the TLB handlers, which we have now solved with the current approach.

> One of the things I'm trying to achieve is to get rid of all the uses of
> CONFIG_CPU_MIPS32_R1 and similar processor symbols in code, getting to the
> point where selecting one of those symbols in Kconfig only means optimizing
> the kernel for the selected core without sacrificing compatibility.

 Well, these options should really be used to select the "-march=" option 
only.  We have some places, such as <asm/stackframe.h>, where the 
dependency is tough to eliminate, but that is definitely the right 
direction.

> (But of course the few machines that support processors with multiple ISAs
> spoil that plan a little ..)

 Well, they have cp0.prid and cp0.config* registers at their disposal.
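
 For illustration, a minimal sketch of splitting a cp0.prid value into its 
fields, assuming the MIPS32/MIPS64 layout (company options, company ID, 
processor ID, revision, one byte each; legacy CPUs leave the upper half 
zero).  The struct and function names are made up, the sample value is 
only an example, and actually reading the register needs a privileged 
"mfc0", which is left out here:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the four PRId fields. */
struct prid_fields {
	uint8_t company_options;	/* bits 31..24 */
	uint8_t company_id;		/* bits 23..16 */
	uint8_t processor_id;		/* bits 15..8 */
	uint8_t revision;		/* bits 7..0 */
};

/* Split a raw PRId value into its byte-wide fields. */
static inline struct prid_fields decode_prid(uint32_t prid)
{
	struct prid_fields f;

	f.company_options = (prid >> 24) & 0xff;
	f.company_id = (prid >> 16) & 0xff;
	f.processor_id = (prid >> 8) & 0xff;
	f.revision = prid & 0xff;
	return f;
}

/* Decode an example value and check the processor ID byte. */
static inline void decode_prid_selfcheck(void)
{
	struct prid_fields f = decode_prid(0x00019300u);

	assert(f.processor_id == 0x93);
}
```

 A run-time check on these fields is what lets a single kernel image pick 
the right code path without a Kconfig symbol per CPU.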

> >  I do not think we happen to handle this scenario -- the more interesting 
> > configurations that could benefit do not support the cp0.ebase register 
> > making per-CPU handlers quite a challenge (i.e. the cost would exceed the 
> > benefit).
> 
> It's doable but there is little point.  Ebase is an R2 feature and who
> on earth would mix pre-R2 and R2 cores in a SOC now that R2 is established
> for a few years?

 I have actually thought of one of your pet SGI machine setups -- where 
the CPUs are mixed and are either MIPS III or MIPS IV each.  I do not 
recall you mentioning the exception vector range of RAM being local to the 
CPU cards, so I am assuming the handlers are always shared.

  Maciej

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/6] tlbex.c: Remove relocs[] and labels[] from the init.data section
  2007-10-09 20:35                         ` Franck Bui-Huu
  (?)
@ 2007-10-10 14:27                         ` Ralf Baechle
  2007-10-10 16:17                           ` Maciej W. Rozycki
  -1 siblings, 1 reply; 93+ messages in thread
From: Ralf Baechle @ 2007-10-10 14:27 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Thiemo Seufer, Maciej W. Rozycki, linux-mips

On Tue, Oct 09, 2007 at 10:35:43PM +0200, Franck Bui-Huu wrote:

> This patch reduces the kernel image size by making these 2 arrays
> automatic variables.
> 
> 	tlbex.o~old  =>  tlbex.o
> 	 text:     9840     9812      -28  0%
> 	 data:     3904     1344    -2560 -65%
> 	  bss:     1568     1568        0  0%
> 	total:    15312    12724    -2588 -16%
> 
> It increases the stack pressure a lot (more than 2500 bytes) but
> at this stage in the boot process, it shouldn't matter.

Even more for a 64-bit kernel - and I would really like to reduce
the kernel stack for 64-bit kernels, THREAD_SIZE_ORDER 2 is already
slightly painful when memory becomes fragmented.

The other issue is that with CPU plugging (halfbreed patches to add that
to MIPS are around) this code can be called at any time, not only during
early startup when at most a timer interrupt may strike.  Bootmem maybe?

  Ralf

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/6] tlbex.c: Remove relocs[] and labels[] from the init.data section
  2007-10-10 14:27                         ` Ralf Baechle
@ 2007-10-10 16:17                           ` Maciej W. Rozycki
  2007-10-10 16:42                             ` Ralf Baechle
  2007-10-10 19:29                             ` Franck Bui-Huu
  0 siblings, 2 replies; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-10 16:17 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Franck Bui-Huu, Thiemo Seufer, linux-mips

On Wed, 10 Oct 2007, Ralf Baechle wrote:

> > It increases the stack pressure a lot (more than 2500 bytes) but
> > at this stage in the boot process, it shouldn't matter.
> 
> Even more for a 64-bit kernel - and I would really like to reduce
> the kernel stack for 64-bit kernels, THREAD_SIZE_ORDER 2 is already
> slightly painful when memory becomes fragmented.

 I think the right fix is to implement "__initbss" along the lines of 
"__initdata".

  Maciej

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/6] tlbex.c: Remove relocs[] and labels[] from the init.data section
  2007-10-10 16:17                           ` Maciej W. Rozycki
@ 2007-10-10 16:42                             ` Ralf Baechle
  2007-10-10 16:55                               ` Geert Uytterhoeven
  2007-10-10 19:29                             ` Franck Bui-Huu
  1 sibling, 1 reply; 93+ messages in thread
From: Ralf Baechle @ 2007-10-10 16:42 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Franck Bui-Huu, Thiemo Seufer, linux-mips

On Wed, Oct 10, 2007 at 05:17:24PM +0100, Maciej W. Rozycki wrote:

> > > It increases the stack pressure a lot (more than 2500 bytes) but
> > > at this stage in the boot process, it shouldn't matter.
> > 
> > Even more for a 64-bit kernel - and I would really like to reduce
> > the kernel stack for 64-bit kernels, THREAD_SIZE_ORDER 2 is already
> > slightly painful when memory becomes fragmented.
> 
>  I think the right fix is to implement "__initbss" along the lines of 
> "__initdata".

Indeed.  Doesn't even look so hard and would likely generally be welcome.

  Ralf

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/6] tlbex.c: Remove relocs[] and labels[] from the init.data section
  2007-10-10 16:42                             ` Ralf Baechle
@ 2007-10-10 16:55                               ` Geert Uytterhoeven
  2007-10-10 17:01                                 ` Maciej W. Rozycki
  2007-10-10 19:58                                 ` Franck Bui-Huu
  0 siblings, 2 replies; 93+ messages in thread
From: Geert Uytterhoeven @ 2007-10-10 16:55 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Maciej W. Rozycki, Franck Bui-Huu, Thiemo Seufer, linux-mips

On Wed, 10 Oct 2007, Ralf Baechle wrote:
> On Wed, Oct 10, 2007 at 05:17:24PM +0100, Maciej W. Rozycki wrote:
> > > > It increases the stack pressure a lot (more than 2500 bytes) but
> > > > at this stage in the boot process, it shouldn't matter.
> > > 
> > > Even more for a 64-bit kernel - and I would really like to reduce
> > > the kernel stack for 64-bit kernels, THREAD_SIZE_ORDER 2 is already
> > > slightly painful when memory becomes fragmented.
> > 
> >  I think the right fix is to implement "__initbss" along the lines of 
> > "__initdata".

Or e.g. static struct label labels[128] __initdata = { 0, };
Cfr. the old rule `always initialize initdata, even if it must be 0'.

> Indeed.  Doesn't even look so hard and would likely generally be welcome.

That's a valid alternative, of course...

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/6] tlbex.c: Remove relocs[] and labels[] from the init.data section
  2007-10-10 16:55                               ` Geert Uytterhoeven
@ 2007-10-10 17:01                                 ` Maciej W. Rozycki
  2007-10-10 17:09                                   ` Geert Uytterhoeven
  2007-10-10 19:58                                 ` Franck Bui-Huu
  1 sibling, 1 reply; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-10 17:01 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Ralf Baechle, Franck Bui-Huu, Thiemo Seufer, linux-mips

On Wed, 10 Oct 2007, Geert Uytterhoeven wrote:

> Or e.g. static struct label labels[128] __initdata = { 0, };
> Cfr. the old rule `always initialize initdata, even if it must be 0'.

 But this will not reduce the size of the kernel image, which is the 
objective here.

  Maciej

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/6] tlbex.c: Remove relocs[] and labels[] from the init.data section
  2007-10-10 17:01                                 ` Maciej W. Rozycki
@ 2007-10-10 17:09                                   ` Geert Uytterhoeven
  0 siblings, 0 replies; 93+ messages in thread
From: Geert Uytterhoeven @ 2007-10-10 17:09 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Ralf Baechle, Franck Bui-Huu, Thiemo Seufer, linux-mips

On Wed, 10 Oct 2007, Maciej W. Rozycki wrote:
> On Wed, 10 Oct 2007, Geert Uytterhoeven wrote:
> > Or e.g. static struct label labels[128] __initdata = { 0, };
> > Cfr. the old rule `always initialize initdata, even if it must be 0'.
> 
>  But this will not reduce the size of the kernel image, which is the 
> objective here.

That's true.

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/6] tlbex.c: Remove relocs[] and labels[] from the init.data section
  2007-10-10 16:17                           ` Maciej W. Rozycki
  2007-10-10 16:42                             ` Ralf Baechle
@ 2007-10-10 19:29                             ` Franck Bui-Huu
  1 sibling, 0 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-10 19:29 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Ralf Baechle, Thiemo Seufer, linux-mips

Maciej W. Rozycki wrote:
> On Wed, 10 Oct 2007, Ralf Baechle wrote:
> 
>>> It increases the stack pressure a lot (more than 2500 bytes) but
>>> at this stage in the boot process, it shouldn't matter.
>> Even more for a 64-bit kernel - and I would really like to reduce
>> the kernel stack for 64-bit kernels, THREAD_SIZE_ORDER 2 is already
>> slightly painful when memory becomes fragmented.
> 
>  I think the right fix is to implement "__initbss" along the lines of 
> "__initdata".
> 

Yes, I think so.

		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 2/6] tlbex.c: Remove relocs[] and labels[] from the init.data section
  2007-10-10 16:55                               ` Geert Uytterhoeven
  2007-10-10 17:01                                 ` Maciej W. Rozycki
@ 2007-10-10 19:58                                 ` Franck Bui-Huu
  1 sibling, 0 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-10 19:58 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Ralf Baechle, Maciej W. Rozycki, Thiemo Seufer, linux-mips

Geert Uytterhoeven wrote:
> Or e.g. static struct label labels[128] __initdata = { 0, };
> Cfr. the old rule `always initialize initdata, even if it must be 0'.
> 

I also noticed that the init data aren't explicitly initialized as they
should be, but they still end up in initdata, not bss.

		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [SPAM?]  Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-10 12:08                                 ` [SPAM?] " Nigel Stephens
@ 2007-10-11 12:01                                   ` Maciej W. Rozycki
  2007-10-13 10:53                                     ` Richard Sandiford
  2007-10-14 19:37                                     ` Franck Bui-Huu
  0 siblings, 2 replies; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-11 12:01 UTC (permalink / raw)
  To: Nigel Stephens; +Cc: Franck Bui-Huu, Ralf Baechle, Thiemo Seufer, linux-mips

On Wed, 10 Oct 2007, Nigel Stephens wrote:

> Actually the -march=4ksd option will allow gcc to use the SmartMIPS lwxs
> (indexed load) instruction, which could save a few instructions here and
> there.

 Good point, but if we decide the lone instruction is worth the hassle, 
then we should use "-msmartmips" on top of the base ISA selection.  
Likewise with "lwx" and "-mdsp".

 Though either way I am not sure these would have to be put in Kconfig or 
Makefile anywhere.  A generic way should be enough for the insistent as 
the potentially useful options may proliferate; we have the CFLAGS_KERNEL 
and CFLAGS_MODULE Makefile variables that would suit for setting upon 
`make' invocation.

  Maciej

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 1/6] tlbex.c: Cleanup __init usages.
  2007-10-09 20:34                       ` [PATCH 1/6] tlbex.c: Cleanup __init usages Franck Bui-Huu
@ 2007-10-11 16:16                         ` Ralf Baechle
  2007-10-12  6:36                           ` Franck Bui-Huu
  0 siblings, 1 reply; 93+ messages in thread
From: Ralf Baechle @ 2007-10-11 16:16 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Thiemo Seufer, Maciej W. Rozycki, linux-mips

On Tue, Oct 09, 2007 at 10:34:26PM +0200, Franck Bui-Huu wrote:

> Subject: [PATCH 1/6] tlbex.c: Cleanup __init usages.  

So I applied this one only while we sort out the .init.bss stuff.

Thanks!

  Ralf

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 1/6] tlbex.c: Cleanup __init usages.
  2007-10-11 16:16                         ` Ralf Baechle
@ 2007-10-12  6:36                           ` Franck Bui-Huu
  2007-10-12 14:43                             ` Ralf Baechle
  0 siblings, 1 reply; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-12  6:36 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Thiemo Seufer, Maciej W. Rozycki, linux-mips

Ralf Baechle wrote:
> 
> So I applied this one only while we sort out the .init.bss stuff.
> 

Could you drop this one too?

The patchset I sent was badly ordered: I should have put the cleanup
patches first, then the optimization ones.

I'm going to resend a patchset that deals with cleanup only.

Thanks,
		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 1/6] tlbex.c: Cleanup __init usages.
  2007-10-12  6:36                           ` Franck Bui-Huu
@ 2007-10-12 14:43                             ` Ralf Baechle
  0 siblings, 0 replies; 93+ messages in thread
From: Ralf Baechle @ 2007-10-12 14:43 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Thiemo Seufer, Maciej W. Rozycki, linux-mips

On Fri, Oct 12, 2007 at 08:36:07AM +0200, Franck Bui-Huu wrote:

> Could you drop this one too?
> 
> The patchset I sent was badly ordered: I should have put the cleanup
> patches first, then the optimization ones.
> 
> I'm going to resend a patchset that deals with cleanup only.

Okay, reverted.  It was causing modpost warnings anyway ...

  Ralf

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [SPAM?]  Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-11 12:01                                   ` Maciej W. Rozycki
@ 2007-10-13 10:53                                     ` Richard Sandiford
  2007-10-15 13:17                                       ` Maciej W. Rozycki
  2007-10-14 19:37                                     ` Franck Bui-Huu
  1 sibling, 1 reply; 93+ messages in thread
From: Richard Sandiford @ 2007-10-13 10:53 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Nigel Stephens, Franck Bui-Huu, Ralf Baechle, Thiemo Seufer,
	linux-mips

"Maciej W. Rozycki" <macro@linux-mips.org> writes:
> On Wed, 10 Oct 2007, Nigel Stephens wrote:
>> Actually the -march=4ksd option will allow gcc to use the SmartMIPS lwxs
>> (indexed load) instruction, which could save a few instructions here and
>> there.
>
>  Good point, but if we decide the lone instruction is worth the hassle, 
> then we should use "-msmartmips" on top of the base ISA selection.  
> Likewise with "lwx" and "-mdsp".

For the record, although that's true of SDE, it isn't (yet) true of
FSF GCC; you need -msmartmips for that.

Richard

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-10 11:58                               ` Maciej W. Rozycki
  2007-10-10 12:08                                 ` [SPAM?] " Nigel Stephens
@ 2007-10-14 19:32                                 ` Franck Bui-Huu
  2007-10-14 19:53                                   ` Thiemo Seufer
  1 sibling, 1 reply; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-14 19:32 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Ralf Baechle, Thiemo Seufer, linux-mips

Maciej W. Rozycki wrote:
> On Tue, 9 Oct 2007, Franck Bui-Huu wrote:
> 
>>>  What would be the gain for the kernel from using "-march=4ksd" rather 
>>> than "-march=mips32r2"?
>>>
>> It actually results in a kernel image ~30kbytes smaller for the former
>> case. It has been discussed some time ago on this list. I'm sorry but
>> I don't know why...
> 
>  Perhaps the pipeline description for the 4KSd CPU is different from the 
> default for the MIPS32r2 ISA.  Barring a study of GCC sources, if that 
> really troubles you, you could build the same version of the kernel with 
> these options:
> 
> 1. "-march=mips32r2"
> 
> 2. "-march=4ksd"
> 
> 3. "-march=mips32r2 -mtune=4ksd"
> 
> and compare the results.  I expect the results of #2 and #3 to be the same 
> and it would just back up my suggestion about keeping CPU-specific 
> optimisations separate from the CPU selection.

I think you misunderstood me, my own fault: the kernel was smaller
with "-march=4ksd". It was bigger when using "-march=mips32r2 -smartmips".

I was using SDE gcc.

		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [SPAM?]  Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-11 12:01                                   ` Maciej W. Rozycki
  2007-10-13 10:53                                     ` Richard Sandiford
@ 2007-10-14 19:37                                     ` Franck Bui-Huu
  2007-10-15 13:26                                       ` Maciej W. Rozycki
  1 sibling, 1 reply; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-14 19:37 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Nigel Stephens, Ralf Baechle, Thiemo Seufer, linux-mips

Maciej W. Rozycki wrote:
> 
>  Though either way I am not sure these would have to be put in Kconfig or 
> Makefile anywhere.  A generic way should be enough for the insistent as 
> the potentially useful options may proliferate; we have the CFLAGS_KERNEL 
> and CFLAGS_MODULE Makefile variables that would suit for setting upon 
> `make' invocation.
> 

In that case very few people would use this optimization.

We could just have one new Kconfig option in kernel hacking submenu:

config EXTRA_CFLAGS
	string
	help
	  If you want to pass additional options to GCC,
	  for example for optimization purposes, use this.


		Franck 

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-14 19:32                                 ` Franck Bui-Huu
@ 2007-10-14 19:53                                   ` Thiemo Seufer
  2007-10-14 20:29                                     ` Franck Bui-Huu
  2007-10-15 19:35                                     ` Franck Bui-Huu
  0 siblings, 2 replies; 93+ messages in thread
From: Thiemo Seufer @ 2007-10-14 19:53 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Maciej W. Rozycki, Ralf Baechle, linux-mips

Franck Bui-Huu wrote:
> Maciej W. Rozycki wrote:
> > On Tue, 9 Oct 2007, Franck Bui-Huu wrote:
> > 
> >>>  What would be the gain for the kernel from using "-march=4ksd" rather 
> >>> than "-march=mips32r2"?
> >>>
> >> It actually results in a kernel image ~30kbytes smaller for the former
> >> case. It has been discussed some time ago on this list. I'm sorry but
> >> I don't know why...
> > 
> >  Perhaps the pipeline description for the 4KSd CPU is different from the 
> > default for the MIPS32r2 ISA.  Barring a study of GCC sources, if that 
> > really troubles you, you could build the same version of the kernel with 
> > these options:
> > 
> > 1. "-march=mips32r2"
> > 
> > 2. "-march=4ksd"
> > 
> > 3. "-march=mips32r2 -mtune=4ksd"
> > 
> > and compare the results.  I expect the results of #2 and #3 to be the same 
> > and it would just back up my suggestion about keeping CPU-specific 
> > optimisations separate from the CPU selection.
> 
> I think you misunderstood me, my own fault: the kernel was smaller
> with "-march=4ksd". It was bigger when using "-march=mips32r2 -smartmips".

Could you check what "-march=mips32r2 -smartmips -mtune=4ksd" does?
I expect it to have the same result as "-march=4ksd".


Thiemo

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-14 19:53                                   ` Thiemo Seufer
@ 2007-10-14 20:29                                     ` Franck Bui-Huu
  2007-10-15 19:35                                     ` Franck Bui-Huu
  1 sibling, 0 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-14 20:29 UTC (permalink / raw)
  To: Thiemo Seufer; +Cc: Maciej W. Rozycki, Ralf Baechle, linux-mips

Thiemo Seufer wrote:
> Could you check what "-march=mips32r2 -smartmips -mtune=4ksd" does?
> I expect it to have the same result as "-march=4ksd".
> 

I'll give it a try tomorrow.

		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [SPAM?]  Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-13 10:53                                     ` Richard Sandiford
@ 2007-10-15 13:17                                       ` Maciej W. Rozycki
  0 siblings, 0 replies; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-15 13:17 UTC (permalink / raw)
  To: Richard Sandiford
  Cc: Nigel Stephens, Franck Bui-Huu, Ralf Baechle, Thiemo Seufer,
	linux-mips

On Sat, 13 Oct 2007, Richard Sandiford wrote:

> >  Good point, but if we decide the lone instruction is worth the hassle, 
> > then we should use "-msmartmips" on top of the base ISA selection.  
> > Likewise with "lwx" and "-mdsp".
> 
> For the record, although that's true of SDE, it isn't (yet) true of
> FSF GCC; you need -msmartmips for that.

 Ah, another argument in favour of the generic approach...  Thanks for the 
point.

  Maciej

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [SPAM?]  Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-14 19:37                                     ` Franck Bui-Huu
@ 2007-10-15 13:26                                       ` Maciej W. Rozycki
  0 siblings, 0 replies; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-15 13:26 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Nigel Stephens, Ralf Baechle, Thiemo Seufer, linux-mips

On Sun, 14 Oct 2007, Franck Bui-Huu wrote:

> >  Though either way I am not sure these would have to be put in Kconfig or 
> > Makefile anywhere.  A generic way should be enough for the insistent as 
> > the potentially useful options may proliferate; we have the CFLAGS_KERNEL 
> > and CFLAGS_MODULE Makefile variables that would suit for setting upon 
> > `make' invocation.
> 
> In that case very few people would use this optimization.

 It's their problem then, isn't it?

> We could just have one new Kconfig option in kernel hacking submenu:
> 
> config EXTRA_CFLAGS
> 	string
> 	help
> 	  If you want to pass additionnal option to GCC
> 	  for optimization purpose for example, use this.

 I don't think we want to see clueless reports from people who have not 
bothered themselves to understand how `make' works and happened to put 
some rubbish into CONFIG_EXTRA_CFLAGS.  Do we?

  Maciej

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-14 19:53                                   ` Thiemo Seufer
  2007-10-14 20:29                                     ` Franck Bui-Huu
@ 2007-10-15 19:35                                     ` Franck Bui-Huu
  2007-10-15 20:11                                       ` Nigel Stephens
  1 sibling, 1 reply; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-15 19:35 UTC (permalink / raw)
  To: Thiemo Seufer; +Cc: Maciej W. Rozycki, Ralf Baechle, linux-mips

Thiemo Seufer wrote:
> Could you check what "-march=mips32r2 -smartmips -mtune=4ksd" does?
> I expect it to have the same result as "-march=4ksd".
> 

OK, I gave it a try and here are some figures:

$ mipsel-linux-size mipssde-6.05.00-20061023/vmlinux~*
   text    data     bss     dec     hex filename
1446130   58456   93056 1597642  1860ca mipssde-6.05.00-20061023/vmlinux~4ksd
1472034   58456   93056 1623546  18c5fa mipssde-6.05.00-20061023/vmlinux~mips32r2-smartmips
1446130   58456   93056 1597642  1860ca mipssde-6.05.00-20061023/vmlinux~mips32r2-smartmips-mtune4ksd

So you're right "-march=mips32r2 -smartmips -mtune=4ksd" gives the
same result as "-march=4ksd"

And the extra space given by "-march=mips32r2 -smartmips" is coming
from some additional nop instructions:

$ mipsel-linux-objdump -D vmlinux~mips32r2-smartmips > vmlinux~mips32r2-smartmips.S
$ mipsel-linux-objdump -D vmlinux~4ksd > vmlinux~4ksd.S
$ grep -c nop *.S
vmlinux~4ksd.S:18708
vmlinux~mips32r2-smartmips.S:27895

It seems that these extra nops are used for load delays. For example:

vmlinux~4ksd.S:
--------------
<snip>
c00008b4:      8fa40040        lw      a0,64(sp)
c00008b8:      27a40018        addiu   a0,sp,24
c00008bc:      0c000148        jal     c0000520 <try_name>
<snip>

vmlinux~mips32r2-smartmips.S:
---------------------------
c00008b8:      8fa40040        lw      a0,64(sp)
c00008bc:      00000000        nop
c00008c0:      27a40018        addiu   a0,sp,24
c00008c4:      0c000148        jal     c0000520 <try_name>

		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-15 19:35                                     ` Franck Bui-Huu
@ 2007-10-15 20:11                                       ` Nigel Stephens
  2007-10-16  8:24                                         ` Franck Bui-Huu
  0 siblings, 1 reply; 93+ messages in thread
From: Nigel Stephens @ 2007-10-15 20:11 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Thiemo Seufer, Maciej W. Rozycki, Ralf Baechle, linux-mips



Franck Bui-Huu wrote:
> Thiemo Seufer wrote:
>   
>> Could you check what "-march=mips32r2 -smartmips -mtune=4ksd" does?
>> I expect it to have the same result as "-march=4ksd".
>>
>>     
>
> OK, I gave it a try and here are some figures:
>
> $ mipsel-linux-size mipssde-6.05.00-20061023/vmlinux~*
>    text    data     bss     dec     hex filename
> 1446130   58456   93056 1597642  1860ca mipssde-6.05.00-20061023/vmlinux~4ksd
> 1472034   58456   93056 1623546  18c5fa mipssde-6.05.00-20061023/vmlinux~mips32r2-smartmips
> 1446130   58456   93056 1597642  1860ca mipssde-6.05.00-20061023/vmlinux~mips32r2-smartmips-mtune4ksd
>
> So you're right "-march=mips32r2 -smartmips -mtune=4ksd" gives the
> same result as "-march=4ksd"
>
>   

IIRC that should be -msmartmips, not -smartmips.

> And the extra space given by "-march=mips32r2 -smartmips" is coming
> from some additional nop instructions:
>
> $ mipsel-linux-objdump -D vmlinux~mips32r2-smartmips > vmlinux~mips32r2-smartmips.S
> $ mipsel-linux-objdump -D vmlinux~4ksd > vmlinux~4ksd.S
> $ grep -c nop *.S
> vmlinux~4ksd.S:18708
> vmlinux~mips32r2-smartmips.S:27895
>
> It seems that these extra nops are used for load delays. For example:
>
> vmlinux~4ksd.S:
> --------------
> <snip>
> c00008b4:      8fa40040        lw      a0,64(sp)
> c00008b8:      27a40018        addiu   a0,sp,24
> c00008bc:      0c000148        jal     c0000520 <try_name>
> <snip>
>
> vmlinux~mips32r2-smartmips.S:
> ---------------------------
> c00008b8:      8fa40040        lw      a0,64(sp)
> c00008bc:      00000000        nop
> c00008c0:      27a40018        addiu   a0,sp,24
> c00008c4:      0c000148        jal     c0000520 <try_name>
>
>   

That's weird: load delay slots should only be required by -march=mips1 
(or no -march)

Are you sure that the -march=mips32r2 option is really getting passed to 
the compiler and assembler?

Nigel

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-15 20:11                                       ` Nigel Stephens
@ 2007-10-16  8:24                                         ` Franck Bui-Huu
  2007-10-16 12:58                                           ` Nigel Stephens
  0 siblings, 1 reply; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-16  8:24 UTC (permalink / raw)
  To: Nigel Stephens; +Cc: Thiemo Seufer, Maciej W. Rozycki, Ralf Baechle, linux-mips

Nigel Stephens wrote:
> 
> 
> Franck Bui-Huu wrote:
>> Thiemo Seufer wrote:
>>  
>>> Could you check what "-march=mips32r2 -smartmips -mtune=4ksd" does?
>>> I expect it to have the same result as "-march=4ksd".
>>>
>>>     
>>
>> OK, I gave it a try and here are some figures:
>>
>> $ mipsel-linux-size mipssde-6.05.00-20061023/vmlinux~*
>>    text    data     bss     dec     hex filename
>> 1446130   58456   93056 1597642  1860ca mipssde-6.05.00-20061023/vmlinux~4ksd
>> 1472034   58456   93056 1623546  18c5fa mipssde-6.05.00-20061023/vmlinux~mips32r2-smartmips
>> 1446130   58456   93056 1597642  1860ca mipssde-6.05.00-20061023/vmlinux~mips32r2-smartmips-mtune4ksd
>>
>> So you're right "-march=mips32r2 -smartmips -mtune=4ksd" gives the
>> same result as "-march=4ksd"
>>
>>   
> 
> IIRC that should be -msmartmips, not -smartmips.

Yes, '-msmartmips' is what was actually used.

> 
>> And the extra space given by "-march=mips32r2 -smartmips" is coming
>> from some additional nop instructions:
>>
>> $ mipsel-linux-objdump -D vmlinux~mips32r2-smartmips > vmlinux~mips32r2-smartmips.S
>> $ mipsel-linux-objdump -D vmlinux~4ksd > vmlinux~4ksd.S
>> $ grep -c nop *.S
>> vmlinux~4ksd.S:18708
>> vmlinux~mips32r2-smartmips.S:27895
>>
>> It seems that these extra nops are used for load delays. For example:
>>
>> vmlinux~4ksd.S:
>> --------------
>> <snip>
>> c00008b4:      8fa40040        lw      a0,64(sp)
>> c00008b8:      27a40018        addiu   a0,sp,24
>> c00008bc:      0c000148        jal     c0000520 <try_name>
>> <snip>
>>
>> vmlinux~mips32r2-smartmips.S:
>> ---------------------------
>> c00008b8:      8fa40040        lw      a0,64(sp)
>> c00008bc:      00000000        nop
>> c00008c0:      27a40018        addiu   a0,sp,24
>> c00008c4:      0c000148        jal     c0000520 <try_name>
>>
>>   
> 
> That's weird: load delay slots should only be required by -march=mips1
> (or no -march)
> 
> Are you sure that the -march=mips32r2 option is really getting passed to
> the compiler and assembler?
> 

Yes I'm pretty sure:

$ mipsel-linux-readelf -h vmlinux~mips32r2-smartmips
File: vmlinux~mips32r2-smartmips
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           MIPS R3000
  Version:                           0x1
  Entry point address:               0xc015e000
  Start of program headers:          52 (bytes into file)
  Start of section headers:          12097028 (bytes into file)
  Flags:                             0x70001001, noreorder, o32, mips32r2
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         1
  Size of section headers:           40 (bytes)
  Number of section headers:         42
  Section header string table index: 39

$ head kernel/.user.o.cmd 
cmd_kernel/user.o := mipsel-linux-gcc -Wp,-MD,kernel/.user.o.d  -nostdinc -isystem /usr/lib/gcc/mipsel-linux/3.4.4/include -D__KERNEL__ -Iinclude  -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -O2  -mabi=32 -G 0 -mno-abicalls -fno-pic -pipe -msoft-float -ffreestanding  -march=mips32r2 -Wa,-mips32r2 -Wa,--trap -msmartmips -Iinclude/asm-mips/mach-usip -Iinclude/asm-mips/mach-generic -D"VMLINUX_LOAD_ADDRESS=0xffffffffc0000000" -fomit-frame-pointer -g  -Wdeclaration-after-statement     -D"KBUILD_STR(s)=\#s" -D"KBUILD_BASENAME=KBUILD_STR(user)"  -D"KBUILD_MODNAME=KBUILD_STR(user)" -c -o kernel/user.o kernel/user.c

deps_kernel/user.o := \
  kernel/user.c \
    $(wildcard include/config/keys.h) \
    $(wildcard include/config/inotify/user.h) \
  include/linux/init.h \
    $(wildcard include/config/modules.h) \
    $(wildcard include/config/hotplug.h) \
    $(wildcard include/config/hotplug/cpu.h) \

		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-16  8:24                                         ` Franck Bui-Huu
@ 2007-10-16 12:58                                           ` Nigel Stephens
  2007-10-17  7:56                                             ` Franck Bui-Huu
  0 siblings, 1 reply; 93+ messages in thread
From: Nigel Stephens @ 2007-10-16 12:58 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Thiemo Seufer, Maciej W. Rozycki, Ralf Baechle, linux-mips



Franck Bui-Huu wrote:
> cmd_kernel/user.o := mipsel-linux-gcc -Wp,-MD,kernel/.user.o.d  -nostdinc -isystem /usr/lib/gcc/mipsel-linux/3.4.4/include -D__KERNEL__ -Iinclude  -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -O2  -mabi=32 -G 0 -mno-abicalls -fno-pic -pipe -msoft-float -ffreestanding  -march=mips32r2 -Wa,-mips32r2 -Wa,--trap -msmartmips -Iinclude/asm-mips/mach-usip -Iinclude/asm-mips/mach-generic -D"VMLINUX_LOAD_ADDRESS=0xffffffffc0000000" -fomit-frame-pointer -g  -Wdeclaration-after-statement     -D"KBUILD_STR(s)=\#s" -D"KBUILD_BASENAME=KBUILD_STR(user)"  -D"KBUILD_MODNAME=KBUILD_STR(user)" -c -o kernel/user.o kernel/user.c
>
>   

Could you run that gcc command manually, adding the options "-v 
--save-temps", and post the resulting output messages, and the user.s file.

Nigel

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-16 12:58                                           ` Nigel Stephens
@ 2007-10-17  7:56                                             ` Franck Bui-Huu
  2007-10-17 12:30                                               ` Thiemo Seufer
  0 siblings, 1 reply; 93+ messages in thread
From: Franck Bui-Huu @ 2007-10-17  7:56 UTC (permalink / raw)
  To: Nigel Stephens; +Cc: Thiemo Seufer, Maciej W. Rozycki, Ralf Baechle, linux-mips

[-- Attachment #1: Type: text/plain, Size: 2741 bytes --]

Nigel Stephens wrote:
> 
> Could you run that gcc command manually, adding the options "-v
> --save-temps", and post the resulting output messages, and the user.s file.
> 

Ok, I did it except I used init/do_mounts.c file since it has at least
one nop load delay slot (cf label $L50 in do_mounts.s)

Here's the output when adding "-v --save-temps" to the gcc cmd line:

mipsel-linux-gcc: warning: -pipe ignored because -save-temps specified
Reading specs from /usr/lib/gcc/mipsel-linux/3.4.4/specs
Configured with: /var/tmp/releasetool-rpm.tmp/bank-20061023-1709/B-i386-linux-rpm/rpmbuild/BUILD/mipssde-6.05.00/configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --disable-gdbtk --enable-languages=c --with-local-prefix=/mipsel-linux/local --disable-multilib --disable-shared --enable-static --disable-threads --target=mipsel-linux --host=i386-linux --build=i386-linux
Thread model: single
gcc version 3.4.4 mipssde-6.05.00-20061023
 /usr/libexec/gcc/mipsel-linux/3.4.4/cc1 -E -quiet -nostdinc -v -Iinclude -Iinclude/asm-mips/mach-generic -U__PIC__ -U__pic__ -D__KERNEL__ -DVMLINUX_LOAD_ADDRESS=0xffffffffc0000000 -DKBUILD_STR(s)=#s -DKBUILD_BASENAME=KBUILD_STR(do_mounts) -DKBUILD_MODNAME=KBUILD_STR(mounts) -isystem /usr/lib/gcc/mipsel-linux/3.4.4/include -include include/linux/autoconf.h -MD init/.do_mounts.o.d init/do_mounts.c -G 0 -mabi=32 -mno-abicalls -msoft-float -march=mips32r2 -msmartmips -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -Werror-implicit-function-declaration -Wdeclaration-after-statement -fno-strict-aliasing -fno-common -fno-pic -ffreestanding -fomit-frame-pointer -fworking-directory -O2 -o do_mounts.i
#include "..." search starts here:
#include <...> search starts here:
 include
 include/asm-mips/mach-generic
 /usr/lib/gcc/mipsel-linux/3.4.4/include
End of search list.
 /usr/libexec/gcc/mipsel-linux/3.4.4/cc1 -fpreprocessed do_mounts.i -G 0 -quiet -dumpbase do_mounts.c -mabi=32 -mno-abicalls -msoft-float -march=mips32r2 -msmartmips -auxbase-strip init/do_mounts.o -g -O2 -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -Werror-implicit-function-declaration -Wdeclaration-after-statement -version -fno-strict-aliasing -fno-common -fno-pic -ffreestanding -fomit-frame-pointer -o do_mounts.s
GNU C version 3.4.4 mipssde-6.05.00-20061023 (mipsel-linux)
	compiled by GNU C version 3.3.
GGC heuristics: --param ggc-min-expand=90 --param ggc-min-heapsize=113135
 /usr/lib/gcc/mipsel-linux/3.4.4/../../../../mipsel-linux/bin/as -G 0 -EL -msmartmips -O2 -g -no-mdebug -32 -march=mips32r2 -v -non_shared -mips32r2 --trap -o init/do_mounts.o do_mounts.s
GNU assembler version 2.15.94 (mipsel-linux) using BFD version 2.15.94 mipssde-6.05.00-20061023

I also attached do_mounts.s.

		Franck

[-- Attachment #2: do_mounts.s.gz --]
[-- Type: application/x-compressed-tar, Size: 62902 bytes --]

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-17  7:56                                             ` Franck Bui-Huu
@ 2007-10-17 12:30                                               ` Thiemo Seufer
  2007-10-17 13:25                                                 ` Nigel Stephens
  0 siblings, 1 reply; 93+ messages in thread
From: Thiemo Seufer @ 2007-10-17 12:30 UTC (permalink / raw)
  To: Franck Bui-Huu
  Cc: Nigel Stephens, Maciej W. Rozycki, Ralf Baechle, linux-mips

Franck Bui-Huu wrote:
> Nigel Stephens wrote:
> > 
> > Could you run that gcc command manually, adding the options "-v
> > --save-temps", and post the resulting output messages, and the user.s file.
> > 
> 
> Ok, I did it except I used init/do_mounts.c file since it has at least
> one nop load delay slot (cf label $L50 in do_mounts.s)

I only see a nop after the final LW; this is alignment for the next label.


Thiemo

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-17 12:30                                               ` Thiemo Seufer
@ 2007-10-17 13:25                                                 ` Nigel Stephens
  2007-10-17 13:31                                                   ` Maciej W. Rozycki
  2007-11-04  8:21                                                   ` Franck Bui-Huu
  0 siblings, 2 replies; 93+ messages in thread
From: Nigel Stephens @ 2007-10-17 13:25 UTC (permalink / raw)
  To: Thiemo Seufer; +Cc: Franck Bui-Huu, Maciej W. Rozycki, Ralf Baechle, linux-mips



Thiemo Seufer wrote:
> Franck Bui-Huu wrote:
>   
>> Nigel Stephens wrote:
>>     
>>> Could you run that gcc command manually, adding the options "-v
>>> --save-temps", and post the resulting output messages, and the user.s file.
>>>
>>>       
>> Ok, I did it except I used init/do_mounts.c file since it has at least
>> one nop load delay slot (cf label $L50 in do_mounts.s)
>>     
>
> I only see a nop after the final LW; this is alignment for the next label.
>
>   

Aha, that probably explains it. Franck is using the "SDE for Linux 
v6.05" toolchain, and in that version of GCC -march=mips32r2 implies a 
default of -mtune=24k. Tuning for 24K implies -falign-loops=8 
-falign-jumps=8 and -falign-functions=8. This is undoubtedly why code 
compiled with "-march=mips32r2 -msmartmips" contains more nops than 
"-march=4ksd". In theory the extra nops should also disappear if you 
compile with -Os instead of -O2.
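
To make the arithmetic concrete, here is a minimal sketch of how many nops 
such alignment padding costs, assuming 4-byte MIPS32 instructions and 
padding done with nops. The helper name pad_nops() and the macro 
MIPS_INSN_BYTES are made up for illustration; they are not part of GCC 
or gas.

```c
#include <assert.h>
#include <stddef.h>

/* Size of one MIPS32 instruction (and of one nop) in bytes. */
#define MIPS_INSN_BYTES 4u

/*
 * Number of nop instructions needed to pad from byte offset `offset`
 * up to the next multiple of `align` (a power of two, in bytes).
 * Hypothetical helper for illustration only.
 */
unsigned int pad_nops(size_t offset, size_t align)
{
	/* The alignment must be a power of two no smaller than one insn. */
	assert(align >= MIPS_INSN_BYTES && (align & (align - 1)) == 0);
	/* Code offsets are always a whole number of instructions. */
	assert(offset % MIPS_INSN_BYTES == 0);

	size_t pad_bytes = (align - (offset & (align - 1))) & (align - 1);

	return (unsigned int)(pad_bytes / MIPS_INSN_BYTES);
}
```

With an alignment of 8, a label that would otherwise land on an odd 
word boundary gets exactly one nop in front of it, and a label already 
on an 8-byte boundary gets none, which would match a single nop after 
the final LW.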

Nigel

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-17 13:25                                                 ` Nigel Stephens
@ 2007-10-17 13:31                                                   ` Maciej W. Rozycki
  2007-11-04  8:21                                                   ` Franck Bui-Huu
  1 sibling, 0 replies; 93+ messages in thread
From: Maciej W. Rozycki @ 2007-10-17 13:31 UTC (permalink / raw)
  To: Nigel Stephens; +Cc: Thiemo Seufer, Franck Bui-Huu, Ralf Baechle, linux-mips

On Wed, 17 Oct 2007, Nigel Stephens wrote:

> Aha, that probably explains it. Franck is using the "SDE for Linux v6.05"
> toolchain, and in that version of GCC -march=mips32r2 implies a default of
> -mtune=24k. Tuning for 24K implies -falign-loops=8 -falign-jumps=8 and
> -falign-functions=8. This is undoubtedly why code compiled with
> "-march=mips32r2 -msmartmips" contains more nops than "-march=4ksd". In theory
> the extra nops should also disappear if you compile with -Os instead of -O2.

 Another argument for using "-mtune=" explicitly when tuning for a 
particular CPU implementation is desired. :-)

  Maciej

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-10-17 13:25                                                 ` Nigel Stephens
  2007-10-17 13:31                                                   ` Maciej W. Rozycki
@ 2007-11-04  8:21                                                   ` Franck Bui-Huu
  2007-11-04 17:47                                                     ` Thiemo Seufer
  1 sibling, 1 reply; 93+ messages in thread
From: Franck Bui-Huu @ 2007-11-04  8:21 UTC (permalink / raw)
  To: Nigel Stephens; +Cc: Thiemo Seufer, Maciej W. Rozycki, Ralf Baechle, linux-mips

Nigel Stephens wrote:
> Aha, that probably explains it. Franck is using the "SDE for Linux
> v6.05" toolchain, and in that version of GCC -march=mips32r2 implies a

[snip]

BTW, are there any other toolchains out there that support smartmips ASE ?

thanks,
		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-11-04  8:21                                                   ` Franck Bui-Huu
@ 2007-11-04 17:47                                                     ` Thiemo Seufer
  2007-11-04 20:19                                                       ` Franck Bui-Huu
  0 siblings, 1 reply; 93+ messages in thread
From: Thiemo Seufer @ 2007-11-04 17:47 UTC (permalink / raw)
  To: Franck Bui-Huu
  Cc: Nigel Stephens, Maciej W. Rozycki, Ralf Baechle, linux-mips

Franck Bui-Huu wrote:
> Nigel Stephens wrote:
> > Aha, that probably explains it. Franck is using the "SDE for Linux
> > v6.05" toolchain, and in that version of GCC -march=mips32r2 implies a
> 
> [snip]
> 
> BTW, are there any other toolchains out there that support smartmips ASE ?

Latest GCC upstream supports it (in SVN since 2007-07-05).


Thiemo

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-11-04 17:47                                                     ` Thiemo Seufer
@ 2007-11-04 20:19                                                       ` Franck Bui-Huu
  2007-11-05 11:36                                                         ` Ralf Baechle
  2007-11-05 15:58                                                         ` Nigel Stephens
  0 siblings, 2 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-11-04 20:19 UTC (permalink / raw)
  To: Thiemo Seufer; +Cc: Nigel Stephens, Maciej W. Rozycki, Ralf Baechle, linux-mips

Thiemo Seufer wrote:
> 
> Latest GCC upstream supports it (in SVN since 2007-07-05).
> 

Good news, although the gcc 4.3 release is planned for the end of January.

Is SDE gcc going to be obsolete after this release ?

		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-11-04 20:19                                                       ` Franck Bui-Huu
@ 2007-11-05 11:36                                                         ` Ralf Baechle
  2007-11-05 21:34                                                           ` Franck Bui-Huu
  2007-11-05 15:58                                                         ` Nigel Stephens
  1 sibling, 1 reply; 93+ messages in thread
From: Ralf Baechle @ 2007-11-05 11:36 UTC (permalink / raw)
  To: Franck Bui-Huu
  Cc: Thiemo Seufer, Nigel Stephens, Maciej W. Rozycki, linux-mips

On Sun, Nov 04, 2007 at 09:19:33PM +0100, Franck Bui-Huu wrote:

> > Latest GCC upstream supports it (in SVN since 2007-07-05).
> > 
> 
> Good news, although the gcc 4.3 release is planned for the end of January.
> 
> Is SDE gcc going to be obsolete after this release ?

As for the kernel I don't really care.  The policy is that a working kernel
must be buildable with a stock gcc.  Which at times is painful, search for
all the great use of .word in include/asm-mips/ ...  This doesn't say
anything against taking advantage of other toolchains such as SDE; it's just
that the functionality must be there with a vanilla GNU toolchain of the
minimum supported version, which currently still is 3.2.

  Ralf

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-11-04 20:19                                                       ` Franck Bui-Huu
  2007-11-05 11:36                                                         ` Ralf Baechle
@ 2007-11-05 15:58                                                         ` Nigel Stephens
  2007-11-05 20:43                                                           ` Franck Bui-Huu
  1 sibling, 1 reply; 93+ messages in thread
From: Nigel Stephens @ 2007-11-05 15:58 UTC (permalink / raw)
  To: Franck Bui-Huu; +Cc: Thiemo Seufer, Maciej W. Rozycki, Ralf Baechle, linux-mips



Franck Bui-Huu wrote:
> Thiemo Seufer wrote:
>   
>> Latest GCC upstream supports it (in SVN since 2007-07-05).
>>
>>     
>
> Good news, although the gcc 4.3 release is planned for the end of January.
>   

Franck

A supported toolchain which now includes SmartMIPS support is the 
CodeSourcery MIPS Linux toolchain, based on gcc-4.2, see 
http://www.codesourcery.com/store/catalogue/c3/p17

> Is SDE gcc going to be obsolete after this release ?
>   

A pre-built "bare-iron" SDE configuration of GCC will probably continue 
to exist as part of the MIPS cross-development tools, but we are working 
to ensure that as many as possible of the SDE changes are available 
upstream for use by other Linux or GNU toolchain vendors, and/or those 
who wish to roll their own toolchain.

Nigel

-- 
Nigel Stephens    | 7200 Cambridge Research Park | t:[+44|0]1223 203110
MIPS Technologies | Cambridge, England  CB25 9TL | f:[+44|0]1223 203181

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-11-05 15:58                                                         ` Nigel Stephens
@ 2007-11-05 20:43                                                           ` Franck Bui-Huu
  0 siblings, 0 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-11-05 20:43 UTC (permalink / raw)
  To: Nigel Stephens; +Cc: Thiemo Seufer, Maciej W. Rozycki, Ralf Baechle, linux-mips

Nigel Stephens wrote:
> 
> A supported toolchain which now includes SmartMIPS support is the
> CodeSourcery MIPS Linux toolchain, based on gcc-4.2, see
> http://www.codesourcery.com/store/catalogue/c3/p17

Interesting.

It seems that MIPS toolchains are not part of the 'lite edition' though.

Thanks,
		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-11-05 11:36                                                         ` Ralf Baechle
@ 2007-11-05 21:34                                                           ` Franck Bui-Huu
  2007-11-05 23:30                                                             ` Ralf Baechle
  0 siblings, 1 reply; 93+ messages in thread
From: Franck Bui-Huu @ 2007-11-05 21:34 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Thiemo Seufer, Nigel Stephens, Maciej W. Rozycki, linux-mips

Ralf Baechle wrote:
> On Sun, Nov 04, 2007 at 09:19:33PM +0100, Franck Bui-Huu wrote:
> 
>>> Latest GCC upstream supports it (in SVN since 2007-07-05).
>>>
>> Good news, although the gcc 4.3 release is planned for the end of January.
>>
>> Is SDE gcc going to be obsolete after this release ?
> 
> As for the kernel I don't really care.  The policy is that a working kernel
> must be buildable with a stock gcc.  Which at times is painful, search for
> all the great use of .word in include/asm-mips/ ...  This doesn't say
> anything against taking advantage of other toolchains such as SDE; it's just

It's actually hard to know the advantages of using SDE over a stock gcc.
The only difference I'm aware of is the smartmips ASE support in SDE.
But since this support is going to be added in stock gccs, I don't see
any advantages now and I'm wondering if I can give up using SDE...

		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-11-05 21:34                                                           ` Franck Bui-Huu
@ 2007-11-05 23:30                                                             ` Ralf Baechle
  2007-11-06  7:23                                                               ` Franck Bui-Huu
  0 siblings, 1 reply; 93+ messages in thread
From: Ralf Baechle @ 2007-11-05 23:30 UTC (permalink / raw)
  To: Franck Bui-Huu
  Cc: Thiemo Seufer, Nigel Stephens, Maciej W. Rozycki, linux-mips

On Mon, Nov 05, 2007 at 10:34:58PM +0100, Franck Bui-Huu wrote:

> It's actually hard to know the advantages of using SDE over a stock gcc.
> The only difference I'm aware of is the smartmips ASE support in SDE.
> But since this support is going to be added in stock gccs, I don't see
> any advantages now and I'm wondering if I can give up using SDE...

In general terms the toolchain has been more reliable and had better
support for MIPS processors before FSF gcc.

  Ralf

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH] mm/pg-r4k.c: Dump the generated code
  2007-11-05 23:30                                                             ` Ralf Baechle
@ 2007-11-06  7:23                                                               ` Franck Bui-Huu
  0 siblings, 0 replies; 93+ messages in thread
From: Franck Bui-Huu @ 2007-11-06  7:23 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Thiemo Seufer, Nigel Stephens, Maciej W. Rozycki, linux-mips

Ralf Baechle wrote:
> On Mon, Nov 05, 2007 at 10:34:58PM +0100, Franck Bui-Huu wrote:
> 
>> It's actually hard to know the advantages of using SDE over a stock gcc.
>> The only difference I'm aware of is the smartmips ASE support in SDE.
>> But since this support is going to be added in stock gccs, I don't see
>> any advantages now and I'm wondering if I can give up using SDE...
> 
> In general terms the toolchain has been more reliable and had better
> support for MIPS processors before FSF gcc.
> 

Good point.

		Franck

^ permalink raw reply	[flat|nested] 93+ messages in thread

