public inbox for linux-sh@vger.kernel.org
* qemu-sh CF access performance
From: Shin-ichiro KAWASAKI @ 2009-03-31 13:19 UTC (permalink / raw)
  To: linux-sh

Hi.  I have a question about qemu-sh.

I am running the qemu-sh r2d emulation with a userland on a CompactFlash (CF)
disk image, to investigate why qemu-sh system emulation is slower than qemu-arm.

For example, it takes only around 1 second to compile a simple hello.c with
gcc on qemu-arm system emulation.  On the other hand, it takes around 40
seconds on qemu-sh system emulation.

The compile time drops from 40 seconds to less than 6 seconds when I run
"% gcc hello.c" again.  So my guess is that

 - the disk cache reduces the compile time on repeated runs,
 - and CompactFlash access performance is rather bad on the qemu-sh r2d system.

I investigated what happens on CF access, and found that

 - the ioread16_rep()/iowrite16_rep() calls in ata_sff_data_xfer(), in
   "drivers/ata/libata-sff.c", cause TLB miss exceptions,

 - and the exception is handled by handle_trapped_io() in
   "arch/sh/kernel/io_trapped.c".

I'd like to ask the linux-sh experts the following questions:

 - Why are such io-traps used to access CF?
 - Will these io-traps be used for SH7785LCR's SD card access?

I guess these io-traps could be the reason why gcc takes so much time on qemu-sh.

Thank you for reading this mail.
Any comments or explanations will be appreciated.

Regards,
Shin-ichiro KAWASAKI


* Re: qemu-sh CF access performance
From: Magnus Damm @ 2009-04-01  4:22 UTC (permalink / raw)
  To: linux-sh

2009/3/31 Shin-ichiro KAWASAKI <kawasaki@juno.dti.ne.jp>:
> I'd like to ask the linux-sh experts the following questions:
>
>  - Why are such io-traps used to access CF?
>  - Will these io-traps be used for SH7785LCR's SD card access?
>
> I guess these io-traps could be the reason why gcc takes so much time on qemu-sh.

The r2d hardware implements a 16-bit-only CF interface, while the driver
software requires 8-bit access. To work around this, io_trapped is used
to convert 8-bit accesses to 16-bit accesses. This is quite slow.

For more information, please see the comment in arch/sh/boards/mach-r2d/setup.c

To improve performance, consider adding a command line flag to the
kernel that disables io_trapped. This flag can then be set on the
kernel command line by the person running qemu.

Hope this helps!

Cheers,

/ magnus


* Re: qemu-sh CF access performance
From: Shin-ichiro KAWASAKI @ 2009-04-01 15:25 UTC (permalink / raw)
  To: linux-sh

Hi, Magnus!  Thank you for your explanation.

Magnus Damm wrote:
> 2009/3/31 Shin-ichiro KAWASAKI <kawasaki@juno.dti.ne.jp>:
>> I'd like to ask the linux-sh experts the following questions:
>>
>>  - Why are such io-traps used to access CF?
>>  - Will these io-traps be used for SH7785LCR's SD card access?
>>
>> I guess these io-traps could be the reason why gcc takes so much time on qemu-sh.
> 
> The r2d hardware implements a 16-bit-only CF interface, while the driver
> software requires 8-bit access. To work around this, io_trapped is used
> to convert 8-bit accesses to 16-bit accesses. This is quite slow.
> 
> For more information, please see the comment in arch/sh/boards/mach-r2d/setup.c
> 
> To improve performance, consider adding a command line flag to the
> kernel that disables io_trapped. This flag can then be set on the
> kernel command line by the person running qemu.
> 
> Hope this helps!

It really helps!

The attached patch is a rough implementation to add a command line
flag 'avoid_trap', which Magnus suggested.  It is just a reference,
but it reduces the gcc compile time from 40 seconds to around 4 seconds.
10 times faster!

Is it OK to add such a qemu-specific option to the Linux kernel mainline?

Regards,
Shin-ichiro KAWASAKI


--- a/linux-2.6.28/arch/sh/boards/mach-r2d/setup.c	2008-12-25 08:26:37.000000000 +0900
+++ b/linux-2.6.28/arch/sh/boards/mach-r2d/setup.c	2009-04-01 23:38:44.000000000 +0900
@@ -198,9 +198,11 @@
 	.minimum_bus_width	= 16,
 };
 
+static int avoid_trap;
+
 static int __init rts7751r2d_devices_setup(void)
 {
-	if (register_trapped_io(&cf_trapped_io) == 0)
+	if (avoid_trap || register_trapped_io(&cf_trapped_io) == 0)
 		platform_device_register(&cf_ide_device);
 
 	spi_register_board_info(spi_bus, ARRAY_SIZE(spi_bus));
@@ -245,6 +247,9 @@
 
 	sm501_reg = (void __iomem *)0xb3e00000 + SM501_DRAM_CONTROL;
 	writel(readl(sm501_reg) | 0x00f107c0, sm501_reg);
+
+	if (strstr(*cmdline_p, "avoid_trap"))
+		avoid_trap = 1;
 }
 
 /*



* Re: qemu-sh CF access performance
From: Paul Mundt @ 2009-04-02  3:32 UTC (permalink / raw)
  To: linux-sh

On Thu, Apr 02, 2009 at 12:25:19AM +0900, Shin-ichiro KAWASAKI wrote:
> The attached patch is a rough implementation to add a command line
> flag 'avoid_trap', which Magnus suggested.  It is just a reference,
> but it reduces the gcc compile time from 40 seconds to around 4 seconds.
> 10 times faster!
> 
> Is it OK to add such a qemu-specific option to the Linux kernel mainline?
> 
Sure, why not. Try this:

---

commit eeee7853c4ffaf5b9eb58f39708e3c78f66cee15
Author: Paul Mundt <lethal@linux-sh.org>
Date:   Thu Apr 2 12:31:16 2009 +0900

    sh: Add a command line option for disabling I/O trapping.
    
    This adds a 'noiotrap' kernel command line option to permit disabling of
    I/O trapping. This is mostly useful for running on emulators where the
    physical device limitations are not an issue.
    
    Signed-off-by: Paul Mundt <lethal@linux-sh.org>

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 240257d..8b2067c 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1544,6 +1544,8 @@ and is between 256 and 4096 characters. It is defined in the file
 			Valid arguments: on, off
 			Default: on
 
+	noiotrap	[SH] Disables trapped I/O port accesses.
+
 	noirqdebug	[X86-32] Disables the code which attempts to detect and
 			disable unhandled interrupt sources.
 
diff --git a/arch/sh/kernel/io_trapped.c b/arch/sh/kernel/io_trapped.c
index 39cd7f3..c22853b 100644
--- a/arch/sh/kernel/io_trapped.c
+++ b/arch/sh/kernel/io_trapped.c
@@ -14,6 +14,7 @@
 #include <linux/bitops.h>
 #include <linux/vmalloc.h>
 #include <linux/module.h>
+#include <linux/init.h>
 #include <asm/system.h>
 #include <asm/mmu_context.h>
 #include <asm/uaccess.h>
@@ -32,6 +33,15 @@ EXPORT_SYMBOL_GPL(trapped_mem);
 #endif
 static DEFINE_SPINLOCK(trapped_lock);
 
+static int trapped_io_disable __read_mostly;
+
+static int __init trapped_io_setup(char *__unused)
+{
+	trapped_io_disable = 1;
+	return 1;
+}
+__setup("noiotrap", trapped_io_setup);
+
 int register_trapped_io(struct trapped_io *tiop)
 {
 	struct resource *res;
@@ -39,6 +49,9 @@ int register_trapped_io(struct trapped_io *tiop)
 	struct page *pages[TRAPPED_PAGES_MAX];
 	int k, n;
 
+	if (unlikely(trapped_io_disable))
+		return 0;
+
 	/* structure must be page aligned */
 	if ((unsigned long)tiop & (PAGE_SIZE - 1))
 		goto bad;


* Re: qemu-sh CF access performance
From: Shin-ichiro KAWASAKI @ 2009-04-02 14:36 UTC (permalink / raw)
  To: linux-sh

Paul Mundt wrote:
> On Thu, Apr 02, 2009 at 12:25:19AM +0900, Shin-ichiro KAWASAKI wrote:
>> The attached patch is a rough implementation to add a command line
>> flag 'avoid_trap', which Magnus suggested.  It is just a reference,
>> but it reduces the gcc compile time from 40 seconds to around 4 seconds.
>> 10 times faster!
>>
>> Is it OK to add such a qemu-specific option to the Linux kernel mainline?
>>
> Sure, why not. Try this:

It works fine, and gives the same performance improvement as my dirty patch.
Thank you, Paul, for your quick and clean patch!

Tested-by: Shin-ichiro KAWASAKI <kawasaki@juno.dti.ne.jp>

