xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0 of 5] Rombios PIO performance
@ 2012-07-30 19:47 Andrew Cooper
  2012-07-30 19:47 ` [PATCH 1 of 5] rombios/keyboard: Don't needlessly poll the status register Andrew Cooper
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Andrew Cooper @ 2012-07-30 19:47 UTC (permalink / raw)
  To: xen-devel; +Cc: Keir Fraser, Jan Beulich

This patch series is the result of some analysis to improve the boot peformance
of HVM guests for XenServer.

Tests were performed with Win7 as the HVM guest, but the fixes apply to all HVM
guests.  All improvements are by reducing the number of PIO traps from rombios,
through Xen to Qemu; some completely needless and some by knowing how Qemu is
going to react.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1 of 5] rombios/keyboard: Don't needlessly poll the status register
  2012-07-30 19:47 [PATCH 0 of 5] Rombios PIO performance Andrew Cooper
@ 2012-07-30 19:47 ` Andrew Cooper
  2012-07-30 19:47 ` [PATCH 2 of 5] rombios/ata: Do not wait for BSY to be set Andrew Cooper
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2012-07-30 19:47 UTC (permalink / raw)
  To: xen-devel; +Cc: Keir Fraser, Jan Beulich

[-- Attachment #1: rombios-keyboard.patch --]
[-- Type: text/x-patch, Size: 693 bytes --]

Repeated polling of the status register is not going to change its value, so
don't needlessly take 8192 traps to Qemu when 1 will do.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

diff -r e6266fc76d08 -r 2c92985dc53f tools/firmware/rombios/rombios.c
--- a/tools/firmware/rombios/rombios.c
+++ b/tools/firmware/rombios/rombios.c
@@ -1805,12 +1805,12 @@ keyboard_init()
     while ( (inb(0x64) & 0x02) && (--max>0)) outb(0x80, 0x00);
 
     /* flush incoming keys */
-    max=0x2000;
+    max=2;
     while (--max > 0) {
         outb(0x80, 0x00);
         if (inb(0x64) & 0x01) {
             inb(0x60);
-            max = 0x2000;
+            max = 2;
             }
         }
 

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 2 of 5] rombios/ata: Do not wait for BSY to be set
  2012-07-30 19:47 [PATCH 0 of 5] Rombios PIO performance Andrew Cooper
  2012-07-30 19:47 ` [PATCH 1 of 5] rombios/keyboard: Don't needlessly poll the status register Andrew Cooper
@ 2012-07-30 19:47 ` Andrew Cooper
  2012-07-30 20:18   ` Pasi Kärkkäinen
  2012-07-30 19:47 ` [PATCH 3 of 5] rombios/ata: Reading this status register has no relevant side effects Andrew Cooper
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Andrew Cooper @ 2012-07-30 19:47 UTC (permalink / raw)
  To: xen-devel; +Cc: Keir Fraser, Jan Beulich

[-- Attachment #1: rombios-ata_reset.patch --]
[-- Type: text/x-patch, Size: 992 bytes --]

I can't find any guarantee in the ATA specification that this will happen, and it
certainly does not with Qemu.  SeaBIOS has replaced it with a call to udelay(5)
instead.

As rombios does not have an equivalent udelay(), so replace the wait with a write
to port 0x80 which is whilelisted by Xen for 'a small delay'.

This causes roughly 42k fewer traps to Qemu, which is very roughly 10% of the
number of traps during boot of a Win7 guest.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

diff -r 2c92985dc53f -r 59de21ca1546 tools/firmware/rombios/rombios.c
--- a/tools/firmware/rombios/rombios.c
+++ b/tools/firmware/rombios/rombios.c
@@ -2914,8 +2914,8 @@ Bit16u device;
 // 8.2.1 (a) -- set SRST in DC
   outb(iobase2+ATA_CB_DC, ATA_CB_DC_HD15 | ATA_CB_DC_NIEN | ATA_CB_DC_SRST);
 
-// 8.2.1 (b) -- wait for BSY
-  await_ide(BSY, iobase1, 20);
+// 8.2.1 (b) -- wait
+  outb(0x80, 0x00);
 
 // 8.2.1 (f) -- clear SRST
   outb(iobase2+ATA_CB_DC, ATA_CB_DC_HD15 | ATA_CB_DC_NIEN);

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 3 of 5] rombios/ata: Reading this status register has no relevant side effects
  2012-07-30 19:47 [PATCH 0 of 5] Rombios PIO performance Andrew Cooper
  2012-07-30 19:47 ` [PATCH 1 of 5] rombios/keyboard: Don't needlessly poll the status register Andrew Cooper
  2012-07-30 19:47 ` [PATCH 2 of 5] rombios/ata: Do not wait for BSY to be set Andrew Cooper
@ 2012-07-30 19:47 ` Andrew Cooper
  2012-07-30 21:57   ` Alan Cox
  2012-07-30 19:47 ` [PATCH 4 of 5] rombios/ata Remove more needless traps from the int 0x13 path Andrew Cooper
  2012-07-30 19:47 ` [PATCH 5 of 5] rombios/debug: Reduce verbosity of rombios Andrew Cooper
  4 siblings, 1 reply; 11+ messages in thread
From: Andrew Cooper @ 2012-07-30 19:47 UTC (permalink / raw)
  To: xen-devel; +Cc: Keir Fraser, Jan Beulich

[-- Attachment #1: rombios-await_ide.patch --]
[-- Type: text/x-patch, Size: 667 bytes --]

So taking two traps when one will do is pointless.  This codepath is on the int
0x13 hot path, and removing it has about a 30% reduction in the number of traps
to Qemu during Win7 boot.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

diff -r 59de21ca1546 -r 5e8fbfea2229 tools/firmware/rombios/rombios.c
--- a/tools/firmware/rombios/rombios.c
+++ b/tools/firmware/rombios/rombios.c
@@ -2540,7 +2540,6 @@ static int await_ide(when_done,base,time
   Bit32u time=0,last=0;
   Bit16u status;
   Bit8u result;
-  status = inb(base + ATA_CB_STAT); // for the times you're supposed to throw one away
   for(;;) {
     status = inb(base+ATA_CB_STAT);
     time++;

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 4 of 5] rombios/ata Remove more needless traps from the int 0x13 path
  2012-07-30 19:47 [PATCH 0 of 5] Rombios PIO performance Andrew Cooper
                   ` (2 preceding siblings ...)
  2012-07-30 19:47 ` [PATCH 3 of 5] rombios/ata: Reading this status register has no relevant side effects Andrew Cooper
@ 2012-07-30 19:47 ` Andrew Cooper
  2012-07-30 19:47 ` [PATCH 5 of 5] rombios/debug: Reduce verbosity of rombios Andrew Cooper
  4 siblings, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2012-07-30 19:47 UTC (permalink / raw)
  To: xen-devel; +Cc: Keir Fraser, Jan Beulich

[-- Attachment #1: rombios-await_ide-more.patch --]
[-- Type: text/x-patch, Size: 4130 bytes --]

The return value from await_ide() is always ignored, and most calls to await_ide()
immediately reread the status register.

Therefore, making await_ide() return the last value of the status register removes
a further two traps on the int 0x13 path, with a similar reduction in the total
number of traps during Win7 boot.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

diff -r 5e8fbfea2229 -r ba71221a69d0 tools/firmware/rombios/rombios.c
--- a/tools/firmware/rombios/rombios.c
+++ b/tools/firmware/rombios/rombios.c
@@ -2531,14 +2531,14 @@ void ata_init( )
 
 #define IDE_TIMEOUT 32000u //32 seconds max for IDE ops
 
-int await_ide();
-static int await_ide(when_done,base,timeout)
+Bit8u await_ide();
+static Bit8u await_ide(when_done,base,timeout)
   Bit8u when_done;
   Bit16u base;
   Bit16u timeout;
 {
   Bit32u time=0,last=0;
-  Bit16u status;
+  Bit8u status;
   Bit8u result;
   for(;;) {
     status = inb(base+ATA_CB_STAT);
@@ -2556,7 +2556,7 @@ static int await_ide(when_done,base,time
     else if (when_done == TIMEOUT)
       result = 0;
 
-    if (result) return 0;
+    if (result) return status;
     if (time>>16 != last) // mod 2048 each 16 ms
     {
       last = time >>16;
@@ -2565,12 +2565,12 @@ static int await_ide(when_done,base,time
     if (status & ATA_CB_STAT_ERR)
     {
       BX_DEBUG_ATA("await_ide: ERROR (TIMEOUT,BSY,!BSY,!BSY_DRQ,!BSY_!DRQ,!BSY_RDY) %d time= %ld timeout= %d\n",when_done,time>>11, timeout);
-      return -1;
+      return status;
     }
     if ((timeout == 0) || ((time>>11) > timeout)) break;
   }
   BX_INFO("IDE time out\n");
-  return -1;
+  return status;
 }
 
 // ---------------------------------------------------------------------------
@@ -3016,8 +3016,7 @@ Bit32u lba_low, lba_high;
   outb(iobase1 + ATA_CB_DH, (slave ? ATA_CB_DH_DEV1 : ATA_CB_DH_DEV0) | (Bit8u) head );
   outb(iobase1 + ATA_CB_CMD, command);
 
-  await_ide(NOT_BSY_DRQ, iobase1, IDE_TIMEOUT);
-  status = inb(iobase1 + ATA_CB_STAT);
+  status = await_ide(NOT_BSY_DRQ, iobase1, IDE_TIMEOUT);
 
   if (status & ATA_CB_STAT_ERR) {
     BX_DEBUG_ATA("ata_cmd_data_in : read error\n");
@@ -3077,8 +3076,7 @@ ASM_END
     current++;
     write_word(ebda_seg, &EbdaData->ata.trsfsectors,current);
     count--;
-    await_ide(NOT_BSY, iobase1, IDE_TIMEOUT);
-    status = inb(iobase1 + ATA_CB_STAT);
+    status = await_ide(NOT_BSY, iobase1, IDE_TIMEOUT);
     if (count == 0) {
       if ( (status & (ATA_CB_STAT_BSY | ATA_CB_STAT_RDY | ATA_CB_STAT_DRQ | ATA_CB_STAT_ERR) )
           != ATA_CB_STAT_RDY ) {
@@ -3167,8 +3165,7 @@ Bit32u lba_low, lba_high;
   outb(iobase1 + ATA_CB_DH, (slave ? ATA_CB_DH_DEV1 : ATA_CB_DH_DEV0) | (Bit8u) head );
   outb(iobase1 + ATA_CB_CMD, command);
 
-  await_ide(NOT_BSY_DRQ, iobase1, IDE_TIMEOUT);
-  status = inb(iobase1 + ATA_CB_STAT);
+  status = await_ide(NOT_BSY_DRQ, iobase1, IDE_TIMEOUT);
 
   if (status & ATA_CB_STAT_ERR) {
     BX_DEBUG_ATA("ata_cmd_data_out : read error\n");
@@ -3316,8 +3313,7 @@ Bit32u length;
   outb(iobase1 + ATA_CB_CMD, ATA_CMD_PACKET);
 
   // Device should ok to receive command
-  await_ide(NOT_BSY_DRQ, iobase1, IDE_TIMEOUT);
-  status = inb(iobase1 + ATA_CB_STAT);
+  status = await_ide(NOT_BSY_DRQ, iobase1, IDE_TIMEOUT);
 
   if (status & ATA_CB_STAT_ERR) {
     BX_DEBUG_ATA("ata_cmd_packet : error, status is %02x\n",status);
@@ -3353,8 +3349,7 @@ ASM_START
 ASM_END
 
   if (inout == ATA_DATA_NO) {
-    await_ide(NOT_BSY, iobase1, IDE_TIMEOUT);
-    status = inb(iobase1 + ATA_CB_STAT);
+    status = await_ide(NOT_BSY, iobase1, IDE_TIMEOUT);
     }
   else {
         Bit16u loops = 0;
@@ -3363,13 +3358,12 @@ ASM_END
 
       if (loops == 0) {//first time through
         status = inb(iobase2 + ATA_CB_ASTAT);
-        await_ide(NOT_BSY_DRQ, iobase1, IDE_TIMEOUT);
+        status = await_ide(NOT_BSY_DRQ, iobase1, IDE_TIMEOUT);
       }
       else
-        await_ide(NOT_BSY, iobase1, IDE_TIMEOUT);
+        status = await_ide(NOT_BSY, iobase1, IDE_TIMEOUT);
       loops++;
 
-      status = inb(iobase1 + ATA_CB_STAT);
       sc = inb(iobase1 + ATA_CB_SC);
 
       // Check if command completed

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 5 of 5] rombios/debug: Reduce verbosity of rombios
  2012-07-30 19:47 [PATCH 0 of 5] Rombios PIO performance Andrew Cooper
                   ` (3 preceding siblings ...)
  2012-07-30 19:47 ` [PATCH 4 of 5] rombios/ata Remove more needless traps from the int 0x13 path Andrew Cooper
@ 2012-07-30 19:47 ` Andrew Cooper
  4 siblings, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2012-07-30 19:47 UTC (permalink / raw)
  To: xen-devel; +Cc: Keir Fraser, Jan Beulich

[-- Attachment #1: rombios-quiet.patch --]
[-- Type: text/x-patch, Size: 864 bytes --]

Default builds of Qemu appear not to log the Bochs debug port, so having rombios
write to the port causes pointless traps.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

diff -r ba71221a69d0 -r 47b563da50f7 tools/firmware/rombios/rombios.h
--- a/tools/firmware/rombios/rombios.h
+++ b/tools/firmware/rombios/rombios.h
@@ -48,10 +48,11 @@
 // per-device basis. Debug info are sent only in debug mode
 #if DEBUG_ROMBIOS
 #  define BX_DEBUG(format, p...)  bios_printf(BIOS_PRINTF_INFO, format, ##p)
+#  define BX_INFO(format, p...)   bios_printf(BIOS_PRINTF_INFO, format, ##p)
 #else
 #  define BX_DEBUG(format, p...)
+#  define BX_INFO(format, p...)
 #endif
-#define BX_INFO(format, p...)   bios_printf(BIOS_PRINTF_INFO, format, ##p)
 #define BX_PANIC(format, p...)  bios_printf(BIOS_PRINTF_DEBHALT, format, ##p)
 
 #define ACPI_DATA_SIZE    0x00010000L

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2 of 5] rombios/ata: Do not wait for BSY to be set
  2012-07-30 19:47 ` [PATCH 2 of 5] rombios/ata: Do not wait for BSY to be set Andrew Cooper
@ 2012-07-30 20:18   ` Pasi Kärkkäinen
  2012-07-31  9:37     ` [PATCH 2 of 5] (V2) " Andrew Cooper
  0 siblings, 1 reply; 11+ messages in thread
From: Pasi Kärkkäinen @ 2012-07-30 20:18 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Keir Fraser, Jan Beulich, xen-devel

On Mon, Jul 30, 2012 at 08:47:21PM +0100, Andrew Cooper wrote:
> I can't find any guarantee in the ATA specification that this will happen, and it
> certainly does not with Qemu.  SeaBIOS has replaced it with a call to udelay(5)
> instead.
> 
> As rombios does not have an equivalent udelay(), so replace the wait with a write
> to port 0x80 which is whilelisted by Xen for 'a small delay'.
>                          ^

a small typo probably.. I think it should say "whitelisted".

-- Pasi

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3 of 5] rombios/ata: Reading this status register has no relevant side effects
  2012-07-30 19:47 ` [PATCH 3 of 5] rombios/ata: Reading this status register has no relevant side effects Andrew Cooper
@ 2012-07-30 21:57   ` Alan Cox
  2012-07-31 10:08     ` David Vrabel
  2012-07-31 10:28     ` Andrew Cooper
  0 siblings, 2 replies; 11+ messages in thread
From: Alan Cox @ 2012-07-30 21:57 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Keir Fraser, Jan Beulich, xen-devel

On Mon, 30 Jul 2012 20:47:22 +0100
Andrew Cooper <andrew.cooper3@citrix.com> wrote:

> So taking two traps when one will do is pointless.  This codepath is on the int
> 0x13 hot path, and removing it has about a 30% reduction in the number of traps
> to Qemu during Win7 boot.

You can't read the status for 400nS after a command issue, so throwing
one away is a typical way to handle that.

All of this is optimising the wrong thing.

The problem is that neither kvm not xen have the most basic prediction
handlers in the kernel side exception code so keep hitting qemu.

For a 99% of the ATA transfers you can predict the next few in and outs
and pre-load them into your trap handler avoiding bouncing into qemu, on
a miss you go back into qemu and load the next prediction block (or tree
even)

That's the kind of optimisation that will really make it fly.

Alan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2 of 5] (V2) rombios/ata: Do not wait for BSY to be set
  2012-07-30 20:18   ` Pasi Kärkkäinen
@ 2012-07-31  9:37     ` Andrew Cooper
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2012-07-31  9:37 UTC (permalink / raw)
  To: Pasi Kärkkäinen
  Cc: Keir (Xen.org), Jan Beulich, xen-devel@lists.xen.org

[-- Attachment #1: Type: text/plain, Size: 759 bytes --]


On 30/07/12 21:18, Pasi Kärkkäinen wrote:
> On Mon, Jul 30, 2012 at 08:47:21PM +0100, Andrew Cooper wrote:
>> I can't find any guarantee in the ATA specification that this will happen, and it
>> certainly does not with Qemu.  SeaBIOS has replaced it with a call to udelay(5)
>> instead.
>>
>> As rombios does not have an equivalent udelay(), so replace the wait with a write
>> to port 0x80 which is whilelisted by Xen for 'a small delay'.
>>                          ^
> a small typo probably.. I think it should say "whitelisted".
>
> -- Pasi
>
 D'oh - yes.  Attached is version 2 which corrects this, and another
grammatical error in the sentence.

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com


[-- Attachment #2: rombios-ata_reset.patch --]
[-- Type: text/x-patch, Size: 1088 bytes --]

# HG changeset patch
# Parent 2c92985dc53fbc62d1a2975aed968a8bb021c8ef
rombios/ata: Do not wait for BSY to be set

I can't find any guarantee in the ATA specification that this will happen, and it
certainly does not with Qemu.  SeaBIOS has replaced it with a call to udelay(5)
instead.

As rombios does not have an equivalent udelay(), replace the wait with a write
to port 0x80 which is whitelisted by Xen for 'a small delay'.

This causes roughly 42k fewer traps to Qemu, which is very roughly 10% of the
number of traps during boot of a Win7 guest.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

diff -r 2c92985dc53f tools/firmware/rombios/rombios.c
--- a/tools/firmware/rombios/rombios.c
+++ b/tools/firmware/rombios/rombios.c
@@ -2914,8 +2914,8 @@ Bit16u device;
 // 8.2.1 (a) -- set SRST in DC
   outb(iobase2+ATA_CB_DC, ATA_CB_DC_HD15 | ATA_CB_DC_NIEN | ATA_CB_DC_SRST);
 
-// 8.2.1 (b) -- wait for BSY
-  await_ide(BSY, iobase1, 20);
+// 8.2.1 (b) -- wait
+  outb(0x80, 0x00);
 
 // 8.2.1 (f) -- clear SRST
   outb(iobase2+ATA_CB_DC, ATA_CB_DC_HD15 | ATA_CB_DC_NIEN);

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3 of 5] rombios/ata: Reading this status register has no relevant side effects
  2012-07-30 21:57   ` Alan Cox
@ 2012-07-31 10:08     ` David Vrabel
  2012-07-31 10:28     ` Andrew Cooper
  1 sibling, 0 replies; 11+ messages in thread
From: David Vrabel @ 2012-07-31 10:08 UTC (permalink / raw)
  To: Alan Cox; +Cc: Andrew Cooper, Keir Fraser, Jan Beulich, xen-devel

On 30/07/12 22:57, Alan Cox wrote:
> On Mon, 30 Jul 2012 20:47:22 +0100
> Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> 
>> So taking two traps when one will do is pointless.  This codepath is on the int
>> 0x13 hot path, and removing it has about a 30% reduction in the number of traps
>> to Qemu during Win7 boot.
> 
> You can't read the status for 400nS after a command issue, so throwing
> one away is a typical way to handle that.

This is only relevant when talking to real hardware, the qemu model has
no such requirement.

Also, I think you mean 400 ns not 400 nanosiemens.

> All of this is optimising the wrong thing.
> 
> The problem is that neither kvm not xen have the most basic prediction
> handlers in the kernel side exception code so keep hitting qemu.

I'd be interested in seeing how you think this will work without
knowledge of the emulated device in the hypervisor.  How does the
predictor know whether accesses have side effects?

A better solution would be to avoid most I/O accesses by the BIOS by
using PV drivers instead.

David

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3 of 5] rombios/ata: Reading this status register has no relevant side effects
  2012-07-30 21:57   ` Alan Cox
  2012-07-31 10:08     ` David Vrabel
@ 2012-07-31 10:28     ` Andrew Cooper
  1 sibling, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2012-07-31 10:28 UTC (permalink / raw)
  To: Alan Cox; +Cc: Keir (Xen.org), Jan Beulich, xen-devel@lists.xen.org

On 30/07/12 22:57, Alan Cox wrote:
> On Mon, 30 Jul 2012 20:47:22 +0100
> Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
>> So taking two traps when one will do is pointless.  This codepath is on the int
>> 0x13 hot path, and removing it has about a 30% reduction in the number of traps
>> to Qemu during Win7 boot.
> You can't read the status for 400nS after a command issue, so throwing
> one away is a typical way to handle that.

On real hardware, but virtual hardware has no such restriction.  This
version of rombios is never going to be running on real hardware.

>
> All of this is optimising the wrong thing.
>
> The problem is that neither kvm not xen have the most basic prediction
> handlers in the kernel side exception code so keep hitting qemu.
>
> For a 99% of the ATA transfers you can predict the next few in and outs
> and pre-load them into your trap handler avoiding bouncing into qemu, on
> a miss you go back into qemu and load the next prediction block (or tree
> even)
>
> That's the kind of optimisation that will really make it fly.

Quite likely, but that is a substantial architectural change, whereas
these patches are the result of a few hours of work and many hours of
testing.

>
> Alan
>

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2012-07-31 10:28 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-07-30 19:47 [PATCH 0 of 5] Rombios PIO performance Andrew Cooper
2012-07-30 19:47 ` [PATCH 1 of 5] rombios/keyboard: Don't needlessly poll the status register Andrew Cooper
2012-07-30 19:47 ` [PATCH 2 of 5] rombios/ata: Do not wait for BSY to be set Andrew Cooper
2012-07-30 20:18   ` Pasi Kärkkäinen
2012-07-31  9:37     ` [PATCH 2 of 5] (V2) " Andrew Cooper
2012-07-30 19:47 ` [PATCH 3 of 5] rombios/ata: Reading this status register has no relevant side effects Andrew Cooper
2012-07-30 21:57   ` Alan Cox
2012-07-31 10:08     ` David Vrabel
2012-07-31 10:28     ` Andrew Cooper
2012-07-30 19:47 ` [PATCH 4 of 5] rombios/ata Remove more needless traps from the int 0x13 path Andrew Cooper
2012-07-30 19:47 ` [PATCH 5 of 5] rombios/debug: Reduce verbosity of rombios Andrew Cooper

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).