qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/2] fw_cfg: Implement fast "blit" operation for rapidly copying in kernel, initrd [etc] into the guest
@ 2010-07-17 13:39 Richard W.M. Jones
  2010-07-17 13:40 ` [Qemu-devel] [PATCH 1/2] Don't call fw_cfg e->callback if e->callback is NULL Richard W.M. Jones
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Richard W.M. Jones @ 2010-07-17 13:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, Gleb Natapov


These two patches implement a faster way to copy the kernel and initrd
data into the guest.  The guest can set up a blit address and then
issue a blit command in order to tell qemu to copy the whole of the
configuration data into the predefined area of memory.

This saves a couple of seconds of boot time when you have very large
initrd images.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into Xen guests.
http://et.redhat.com/~rjones/virt-p2v

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 1/2] Don't call fw_cfg e->callback if e->callback is NULL.
  2010-07-17 13:39 [Qemu-devel] [PATCH 0/2] fw_cfg: Implement fast "blit" operation for rapidly copying in kernel, initrd [etc] into the guest Richard W.M. Jones
@ 2010-07-17 13:40 ` Richard W.M. Jones
  2010-07-17 13:41 ` [Qemu-devel] [PATCH 2/2] fw_cfg: Add blit operation for copying kernel, initrd, Richard W.M. Jones
  2010-07-19 10:15 ` [Qemu-devel] [PATCH 0/2 version 2] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest Richard W.M. Jones
  2 siblings, 0 replies; 17+ messages in thread
From: Richard W.M. Jones @ 2010-07-17 13:40 UTC (permalink / raw)
  To: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 268 bytes --]


-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora

[-- Attachment #2: 0001-Don-t-call-fw_cfg-e-callback-if-e-callback-is-NULL.patch --]
[-- Type: text/plain, Size: 1111 bytes --]

>From aded1d171a2b8f12830a43b3a0893da9acb708ff Mon Sep 17 00:00:00 2001
From: Richard Jones <rjones@redhat.com>
Date: Sat, 17 Jul 2010 14:23:21 +0100
Subject: [PATCH 1/2] Don't call fw_cfg e->callback if e->callback is NULL.

If you set up a fw_cfg writable entry without a callback, then
e->callback is still called, causing qemu to segfault.

Luckily since nothing in qemu uses writable entries at the moment,
this is not exploitable.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
---
 hw/fw_cfg.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
index 72866ae..37e6f1f 100644
--- a/hw/fw_cfg.c
+++ b/hw/fw_cfg.c
@@ -65,7 +65,8 @@ static void fw_cfg_write(FWCfgState *s, uint8_t value)
     if (s->cur_entry & FW_CFG_WRITE_CHANNEL && s->cur_offset < e->len) {
         e->data[s->cur_offset++] = value;
         if (s->cur_offset == e->len) {
-            e->callback(e->callback_opaque, e->data);
+            if (e->callback)
+                e->callback(e->callback_opaque, e->data);
             s->cur_offset = 0;
         }
     }
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 2/2] fw_cfg: Add blit operation for copying kernel, initrd, ..
  2010-07-17 13:39 [Qemu-devel] [PATCH 0/2] fw_cfg: Implement fast "blit" operation for rapidly copying in kernel, initrd [etc] into the guest Richard W.M. Jones
  2010-07-17 13:40 ` [Qemu-devel] [PATCH 1/2] Don't call fw_cfg e->callback if e->callback is NULL Richard W.M. Jones
@ 2010-07-17 13:41 ` Richard W.M. Jones
  2010-07-18 20:47   ` Alexander Graf
  2010-07-18 23:59   ` Aurelien Jarno
  2010-07-19 10:15 ` [Qemu-devel] [PATCH 0/2 version 2] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest Richard W.M. Jones
  2 siblings, 2 replies; 17+ messages in thread
From: Richard W.M. Jones @ 2010-07-17 13:41 UTC (permalink / raw)
  To: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 252 bytes --]


-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top

[-- Attachment #2: 0002-fw_cfg-Add-blit-operation-for-copying-kernel-initrd-.patch --]
[-- Type: text/plain, Size: 5674 bytes --]

>From cd167170239d60c8d13c56c881ee5f31387857af Mon Sep 17 00:00:00 2001
From: Richard Jones <rjones@redhat.com>
Date: Sat, 17 Jul 2010 14:30:46 +0100
Subject: [PATCH 2/2] fw_cfg: Add blit operation for copying kernel, initrd, setup, cmdline.

This adds a "blit" operation for rapidly copying the kernel, initrd
etc into the guest.  Instead of doing an "inb" operation for every
byte of these images, the guest sets up a blit address and issues
a special read_fw with the FW_CFG_BLIT bit set.  This instructs
qemu to copy the whole of the image to the blit address.

With this patch, loading large initrds is several seconds faster,
and overall boot time is reduced correspondingly.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
---
 hw/fw_cfg.c                   |   19 ++++++++++++++++++-
 hw/fw_cfg.h                   |    7 +++++--
 pc-bios/optionrom/linuxboot.S |    8 ++++----
 pc-bios/optionrom/optionrom.h |   37 +++++++++++++++++++++++++++++++++++++
 4 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
index 37e6f1f..12206ea 100644
--- a/hw/fw_cfg.c
+++ b/hw/fw_cfg.c
@@ -55,6 +55,11 @@ struct FWCfgState {
     uint32_t cur_offset;
 };
 
+/* Target address for blit operations.  Only writes to lower 4GB
+ * addresses are supported .
+ */
+static uint32_t blitaddr = 0;
+
 static void fw_cfg_write(FWCfgState *s, uint8_t value)
 {
     int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
@@ -98,7 +103,17 @@ static uint8_t fw_cfg_read(FWCfgState *s)
 
     if (s->cur_entry == FW_CFG_INVALID || !e->data || s->cur_offset >= e->len)
         ret = 0;
-    else
+    else if (s->cur_entry & FW_CFG_BLIT) {
+        if (blitaddr == 0)
+            ret = 0; /* guest must set up a blit address beforehand */
+        else {
+            cpu_physical_memory_write ((target_phys_addr_t) blitaddr,
+                                       &e->data[s->cur_offset],
+                                       e->len - s->cur_offset);
+            s->cur_offset += e->len;
+            ret = 1;
+        }
+    } else
         ret = e->data[s->cur_offset++];
 
     FW_CFG_DPRINTF("read %d\n", ret);
@@ -351,6 +366,8 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
     fw_cfg_add_i16(s, FW_CFG_NB_CPUS, (uint16_t)smp_cpus);
     fw_cfg_add_i16(s, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
     fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu);
+    fw_cfg_add_bytes(s, FW_CFG_BLIT_ADDR | FW_CFG_WRITE_CHANNEL,
+                     (uint8_t *)&blitaddr, sizeof blitaddr);
 
     return s;
 }
diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
index 4d13a4f..c64f1e8 100644
--- a/hw/fw_cfg.h
+++ b/hw/fw_cfg.h
@@ -30,11 +30,14 @@
 
 #define FW_CFG_FILE_FIRST       0x20
 #define FW_CFG_FILE_SLOTS       0x10
-#define FW_CFG_MAX_ENTRY        (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
 
+#define FW_CFG_BLIT_ADDR        0x30
+#define FW_CFG_MAX_ENTRY        (FW_CFG_BLIT_ADDR+1)
+
+#define FW_CFG_BLIT             0x2000
 #define FW_CFG_WRITE_CHANNEL    0x4000
 #define FW_CFG_ARCH_LOCAL       0x8000
-#define FW_CFG_ENTRY_MASK       ~(FW_CFG_WRITE_CHANNEL | FW_CFG_ARCH_LOCAL)
+#define FW_CFG_ENTRY_MASK       ~(FW_CFG_BLIT | FW_CFG_WRITE_CHANNEL | FW_CFG_ARCH_LOCAL)
 
 #define FW_CFG_INVALID          0xffff
 
diff --git a/pc-bios/optionrom/linuxboot.S b/pc-bios/optionrom/linuxboot.S
index c109363..b08c69e 100644
--- a/pc-bios/optionrom/linuxboot.S
+++ b/pc-bios/optionrom/linuxboot.S
@@ -106,10 +106,10 @@ copy_kernel:
 	/* We're now running in 16-bit CS, but 32-bit ES! */
 
 	/* Load kernel and initrd */
-	read_fw_blob_addr32(FW_CFG_KERNEL)
-	read_fw_blob_addr32(FW_CFG_INITRD)
-	read_fw_blob_addr32(FW_CFG_CMDLINE)
-	read_fw_blob_addr32(FW_CFG_SETUP)
+	read_fw_blit(FW_CFG_KERNEL)
+	read_fw_blit(FW_CFG_INITRD)
+	read_fw_blit(FW_CFG_CMDLINE)
+	read_fw_blit(FW_CFG_SETUP)
 
 	/* And now jump into Linux! */
 	mov		$0, %eax
diff --git a/pc-bios/optionrom/optionrom.h b/pc-bios/optionrom/optionrom.h
index fbdd48a..a4dd49c 100644
--- a/pc-bios/optionrom/optionrom.h
+++ b/pc-bios/optionrom/optionrom.h
@@ -50,6 +50,27 @@
 	bswap		%eax
 .endm
 
+/*
+ * Write a variable from the fw_cfg device.
+ * In:          %eax
+ * Clobbers:	%edx
+ */
+.macro write_fw VAR
+        push            %eax
+        mov             $(\VAR|FW_CFG_WRITE_CHANNEL), %ax
+        mov             $BIOS_CFG_IOPORT_CFG, %dx
+        outw            %ax, (%dx)
+        pop             %eax
+        mov             $BIOS_CFG_IOPORT_DATA, %dx
+        outb            %al, (%dx)
+        shr             $8, %eax
+        outb            %al, (%dx)
+        shr             $8, %eax
+        outb            %al, (%dx)
+        shr             $8, %eax
+        outb            %al, (%dx)
+.endm
+
 #define read_fw_blob_pre(var)				\
 	read_fw		var ## _ADDR;			\
 	mov		%eax, %edi;			\
@@ -87,6 +108,22 @@
 	*/						\
 	.dc.b		0x67,0xf3,0x6c
 
+/*
+ * Fast blit data from fw_cfg device into physical memory.
+ *
+ * Works as a much faster equivalent to read_fw_blob_add32.  Except
+ * that var##_SIZE is ignored -- instead the host always blits to
+ * the end of the available data in the requested entry.
+ */
+#define read_fw_blit(var)                               \
+        read_fw         var ## _ADDR;                   \
+        write_fw        FW_CFG_BLIT_ADDR;               \
+        mov             $(var ## _DATA|FW_CFG_BLIT), %ax; \
+        mov             $BIOS_CFG_IOPORT_CFG, %edx;     \
+        outw            %ax, (%dx);                     \
+        mov             $BIOS_CFG_IOPORT_DATA, %dx;     \
+        inb             (%dx), %al
+
 #define OPTION_ROM_START					\
     .code16;						\
     .text;						\
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] fw_cfg: Add blit operation for copying kernel, initrd, ..
  2010-07-17 13:41 ` [Qemu-devel] [PATCH 2/2] fw_cfg: Add blit operation for copying kernel, initrd, Richard W.M. Jones
@ 2010-07-18 20:47   ` Alexander Graf
  2010-07-18 21:12     ` Richard W.M. Jones
  2010-07-18 23:59   ` Aurelien Jarno
  1 sibling, 1 reply; 17+ messages in thread
From: Alexander Graf @ 2010-07-18 20:47 UTC (permalink / raw)
  To: Richard W.M.Jones; +Cc: qemu-devel


On 17.07.2010, at 15:41, Richard W.M. Jones wrote:

> 
> -- 
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> virt-top is 'top' for virtual machines.  Tiny program with many
> powerful monitoring features, net stats, disk stats, logging, etc.
> http://et.redhat.com/~rjones/virt-top
> <0002-fw_cfg-Add-blit-operation-for-copying-kernel-initrd-.patch>

Uh - that is a full blown attachment. Nothing I can easily quote. I'll give it a try anyways by copy&pasting it...

> This adds a "blit" operation for rapidly copying the kernel, initrd
> etc into the guest.  Instead of doing an "inb" operation for every
> byte of these images, the guest sets up a blit address and issues
> a special read_fw with the FW_CFG_BLIT bit set.  This instructs
> qemu to copy the whole of the image to the blit address.
> 
> With this patch, loading large initrds is several seconds faster,
> and overall boot time is reduced correspondingly.
> 
> Signed-off-by: Richard W.M. Jones <address@hidden>
> ---
>  hw/fw_cfg.c                   |   19 ++++++++++++++++++-
>  hw/fw_cfg.h                   |    7 +++++--
>  pc-bios/optionrom/linuxboot.S |    8 ++++----
>  pc-bios/optionrom/optionrom.h |   37 +++++++++++++++++++++++++++++++++++++
>  4 files changed, 64 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
> index 37e6f1f..12206ea 100644
> --- a/hw/fw_cfg.c
> +++ b/hw/fw_cfg.c
> @@ -55,6 +55,11 @@ struct FWCfgState {
>      uint32_t cur_offset;
>  };
>  
> +/* Target address for blit operations.  Only writes to lower 4GB
> + * addresses are supported .
> + */
> +static uint32_t blitaddr = 0;
> +
>  static void fw_cfg_write(FWCfgState *s, uint8_t value)
>  {
>      int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> @@ -98,7 +103,17 @@ static uint8_t fw_cfg_read(FWCfgState *s)
>  
>      if (s->cur_entry == FW_CFG_INVALID || !e->data || s->cur_offset >= e->len)
>          ret = 0;
> -    else
> +    else if (s->cur_entry & FW_CFG_BLIT) {
> +        if (blitaddr == 0)
> +            ret = 0; /* guest must set up a blit address beforehand */

Coding style. I'm also not sure this has to be special-cased. Why care if the guest wants to corrupt its own address space?

> +        else {
> +            cpu_physical_memory_write ((target_phys_addr_t) blitaddr,
> +                                       &e->data[s->cur_offset],
> +                                       e->len - s->cur_offset);
> +            s->cur_offset += e->len;
> +            ret = 1;
> +        }
> +    } else
>          ret = e->data[s->cur_offset++];
>  
>      FW_CFG_DPRINTF("read %d\n", ret);
> @@ -351,6 +366,8 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
>      fw_cfg_add_i16(s, FW_CFG_NB_CPUS, (uint16_t)smp_cpus);
>      fw_cfg_add_i16(s, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
>      fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu);
> +    fw_cfg_add_bytes(s, FW_CFG_BLIT_ADDR | FW_CFG_WRITE_CHANNEL,
> +                     (uint8_t *)&blitaddr, sizeof blitaddr);
>  
>      return s;
>  }
> diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
> index 4d13a4f..c64f1e8 100644
> --- a/hw/fw_cfg.h
> +++ b/hw/fw_cfg.h
> @@ -30,11 +30,14 @@
>  
>  #define FW_CFG_FILE_FIRST       0x20
>  #define FW_CFG_FILE_SLOTS       0x10
> -#define FW_CFG_MAX_ENTRY        (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
>  
> +#define FW_CFG_BLIT_ADDR        0x30
> +#define FW_CFG_MAX_ENTRY        (FW_CFG_BLIT_ADDR+1)
> +
> +#define FW_CFG_BLIT             0x2000
>  #define FW_CFG_WRITE_CHANNEL    0x4000
>  #define FW_CFG_ARCH_LOCAL       0x8000
> -#define FW_CFG_ENTRY_MASK       ~(FW_CFG_WRITE_CHANNEL | FW_CFG_ARCH_LOCAL)
> +#define FW_CFG_ENTRY_MASK       ~(FW_CFG_BLIT | FW_CFG_WRITE_CHANNEL | FW_CFG_ARCH_LOCAL)
>  
>  #define FW_CFG_INVALID          0xffff
>  
> diff --git a/pc-bios/optionrom/linuxboot.S b/pc-bios/optionrom/linuxboot.S
> index c109363..b08c69e 100644
> --- a/pc-bios/optionrom/linuxboot.S
> +++ b/pc-bios/optionrom/linuxboot.S
> @@ -106,10 +106,10 @@ copy_kernel:
>  	/* We're now running in 16-bit CS, but 32-bit ES! */
>  
>  	/* Load kernel and initrd */
> -	read_fw_blob_addr32(FW_CFG_KERNEL)
> -	read_fw_blob_addr32(FW_CFG_INITRD)
> -	read_fw_blob_addr32(FW_CFG_CMDLINE)
> -	read_fw_blob_addr32(FW_CFG_SETUP)
> +	read_fw_blit(FW_CFG_KERNEL)
> +	read_fw_blit(FW_CFG_INITRD)
> +	read_fw_blit(FW_CFG_CMDLINE)
> +	read_fw_blit(FW_CFG_SETUP)

No. The interface should be generic. What I envision is an interface that exposes all variables via DMA. You set up the address you DMA to, the length of the DMA, the field you want to DMA and submit a GO command. After that, the variable is in guest address space.

>  
>  	/* And now jump into Linux! */
>  	mov		$0, %eax
> diff --git a/pc-bios/optionrom/optionrom.h b/pc-bios/optionrom/optionrom.h

Whenever touching common code, please make sure that multiboot also still works. The easiest test case there is Xen.

> +/*
> + * Fast blit data from fw_cfg device into physical memory.
> + *
> + * Works as a much faster equivalent to read_fw_blob_add32.  Except
> + * that var##_SIZE is ignored -- instead the host always blits to
> + * the end of the available data in the requested entry.

The length should be guest defined.


Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] fw_cfg: Add blit operation for copying kernel, initrd, ..
  2010-07-18 20:47   ` Alexander Graf
@ 2010-07-18 21:12     ` Richard W.M. Jones
  2010-07-18 21:13       ` Alexander Graf
  0 siblings, 1 reply; 17+ messages in thread
From: Richard W.M. Jones @ 2010-07-18 21:12 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel

On Sun, Jul 18, 2010 at 10:47:27PM +0200, Alexander Graf wrote:
> > +        if (blitaddr == 0)
> > +            ret = 0; /* guest must set up a blit address beforehand */
> 
> Coding style. I'm also not sure this has to be special-cased. Why care if the guest wants to corrupt its own address space?
> 

I agree that we don't need to special-case this.  But what was the
coding style issue? (the comment?)

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] fw_cfg: Add blit operation for copying kernel, initrd, ..
  2010-07-18 21:12     ` Richard W.M. Jones
@ 2010-07-18 21:13       ` Alexander Graf
  0 siblings, 0 replies; 17+ messages in thread
From: Alexander Graf @ 2010-07-18 21:13 UTC (permalink / raw)
  To: Richard W.M. Jones; +Cc: qemu-devel


On 18.07.2010, at 23:12, Richard W.M. Jones wrote:

> On Sun, Jul 18, 2010 at 10:47:27PM +0200, Alexander Graf wrote:
>>> +        if (blitaddr == 0)
>>> +            ret = 0; /* guest must set up a blit address beforehand */
>> 
>> Coding style. I'm also not sure this has to be special-cased. Why care if the guest wants to corrupt its own address space?
>> 
> 
> I agree that we don't need to special-case this.  But what was the
> coding style issue? (the comment?)

Qemu's CODING_STYLE file requires you to do things like this:

if (x) {
    foo();
} else {
    bar();
}

Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] fw_cfg: Add blit operation for copying kernel, initrd, ..
  2010-07-17 13:41 ` [Qemu-devel] [PATCH 2/2] fw_cfg: Add blit operation for copying kernel, initrd, Richard W.M. Jones
  2010-07-18 20:47   ` Alexander Graf
@ 2010-07-18 23:59   ` Aurelien Jarno
  2010-07-19  7:23     ` Richard W.M. Jones
  1 sibling, 1 reply; 17+ messages in thread
From: Aurelien Jarno @ 2010-07-18 23:59 UTC (permalink / raw)
  To: Richard W.M. Jones; +Cc: qemu-devel

On Sat, Jul 17, 2010 at 02:41:04PM +0100, Richard W.M. Jones wrote:
> 
> -- 
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> virt-top is 'top' for virtual machines.  Tiny program with many
> powerful monitoring features, net stats, disk stats, logging, etc.
> http://et.redhat.com/~rjones/virt-top

> >From cd167170239d60c8d13c56c881ee5f31387857af Mon Sep 17 00:00:00 2001
> From: Richard Jones <rjones@redhat.com>
> Date: Sat, 17 Jul 2010 14:30:46 +0100
> Subject: [PATCH 2/2] fw_cfg: Add blit operation for copying kernel, initrd, setup, cmdline.
> 
> This adds a "blit" operation for rapidly copying the kernel, initrd
> etc into the guest.  Instead of doing an "inb" operation for every
> byte of these images, the guest sets up a blit address and issues
> a special read_fw with the FW_CFG_BLIT bit set.  This instructs
> qemu to copy the whole of the image to the blit address.
> 
> With this patch, loading large initrds is several seconds faster,
> and overall boot time is reduced correspondingly.
> 
> Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
> ---
>  hw/fw_cfg.c                   |   19 ++++++++++++++++++-
>  hw/fw_cfg.h                   |    7 +++++--
>  pc-bios/optionrom/linuxboot.S |    8 ++++----
>  pc-bios/optionrom/optionrom.h |   37 +++++++++++++++++++++++++++++++++++++
>  4 files changed, 64 insertions(+), 7 deletions(-)

OpenBIOS also uses the same firmware interface, so it would need to be
changed if this patch is accepted.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] fw_cfg: Add blit operation for copying kernel, initrd, ..
  2010-07-18 23:59   ` Aurelien Jarno
@ 2010-07-19  7:23     ` Richard W.M. Jones
  2010-07-19  9:19       ` Aurelien Jarno
  0 siblings, 1 reply; 17+ messages in thread
From: Richard W.M. Jones @ 2010-07-19  7:23 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On Mon, Jul 19, 2010 at 01:59:22AM +0200, Aurelien Jarno wrote:
> OpenBIOS also uses the same firmware interface, so it would need to be
> changed if this patch is accepted.

The patch leaves the old interface.  Does it still need to be changed?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] fw_cfg: Add blit operation for copying kernel, initrd, ..
  2010-07-19  7:23     ` Richard W.M. Jones
@ 2010-07-19  9:19       ` Aurelien Jarno
  0 siblings, 0 replies; 17+ messages in thread
From: Aurelien Jarno @ 2010-07-19  9:19 UTC (permalink / raw)
  To: Richard W.M. Jones; +Cc: qemu-devel

On Mon, Jul 19, 2010 at 08:23:34AM +0100, Richard W.M. Jones wrote:
> On Mon, Jul 19, 2010 at 01:59:22AM +0200, Aurelien Jarno wrote:
> > OpenBIOS also uses the same firmware interface, so it would need to be
> > changed if this patch is accepted.
> 
> The patch leaves the old interface.  Does it still need to be changed?
> 

As long as the old interface is kept, that should be fine for OpenBIOS.

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 0/2 version 2] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest
  2010-07-17 13:39 [Qemu-devel] [PATCH 0/2] fw_cfg: Implement fast "blit" operation for rapidly copying in kernel, initrd [etc] into the guest Richard W.M. Jones
  2010-07-17 13:40 ` [Qemu-devel] [PATCH 1/2] Don't call fw_cfg e->callback if e->callback is NULL Richard W.M. Jones
  2010-07-17 13:41 ` [Qemu-devel] [PATCH 2/2] fw_cfg: Add blit operation for copying kernel, initrd, Richard W.M. Jones
@ 2010-07-19 10:15 ` Richard W.M. Jones
  2010-07-19 10:15   ` [Qemu-devel] [PATCH 1/2 version 2] Don't call fw_cfg e->callback if e->callback is NULL Richard W.M. Jones
                     ` (3 more replies)
  2 siblings, 4 replies; 17+ messages in thread
From: Richard W.M. Jones @ 2010-07-19 10:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, Gleb Natapov


This is the second version of the patch.

We don't use the word "blit" any more, instead this is replaced with
"DMA", even though it's not quite like a DMA operation on physical
hardware.

The guest writes the physical address and size to two 32 bit fw_cfg
variables.  Then when the guest issues an ordinary read operation with
the extra FW_CFG_DMA flag set, instead of returning a single byte,
qemu "DMA"s the requested data into the guest memory.

The guest shouldn't be able to request a dma_size larger than the
amount of data in the entry.  The patch checks this and adjusts
dma_size.

The guest might select a dma_addr which does not correspond to
physical memory (or dma_addr + dma_size).  Reading the code it seems
to be that cpu_physical_memory_write catches this case and will
abort() (so the guest is only harming itself).  However I'd quite like
an expert opinion on this ...

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 1/2 version 2] Don't call fw_cfg e->callback if e->callback is NULL.
  2010-07-19 10:15 ` [Qemu-devel] [PATCH 0/2 version 2] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest Richard W.M. Jones
@ 2010-07-19 10:15   ` Richard W.M. Jones
  2010-07-19 10:16   ` [Qemu-devel] [PATCH 2/2 version 2] fw_cfg: Allow guest to read kernel etc via fast, synchronous "DMA"-type operation Richard W.M. Jones
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Richard W.M. Jones @ 2010-07-19 10:15 UTC (permalink / raw)
  To: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 298 bytes --]


-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://et.redhat.com/~rjones/libguestfs/
See what it can do: http://et.redhat.com/~rjones/libguestfs/recipes.html

[-- Attachment #2: 0001-Don-t-call-fw_cfg-e-callback-if-e-callback-is-NULL.patch --]
[-- Type: text/plain, Size: 1111 bytes --]

>From 1fe39da6476a6ff05e9705cd7f63f94c93746053 Mon Sep 17 00:00:00 2001
From: Richard Jones <rjones@redhat.com>
Date: Sat, 17 Jul 2010 14:23:21 +0100
Subject: [PATCH 1/2] Don't call fw_cfg e->callback if e->callback is NULL.

If you set up a fw_cfg writable entry without a callback, then
e->callback is still called, causing qemu to segfault.

Luckily since nothing in qemu uses writable entries at the moment,
this is not exploitable.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
---
 hw/fw_cfg.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
index 72866ae..37e6f1f 100644
--- a/hw/fw_cfg.c
+++ b/hw/fw_cfg.c
@@ -65,7 +65,8 @@ static void fw_cfg_write(FWCfgState *s, uint8_t value)
     if (s->cur_entry & FW_CFG_WRITE_CHANNEL && s->cur_offset < e->len) {
         e->data[s->cur_offset++] = value;
         if (s->cur_offset == e->len) {
-            e->callback(e->callback_opaque, e->data);
+            if (e->callback)
+                e->callback(e->callback_opaque, e->data);
             s->cur_offset = 0;
         }
     }
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 2/2 version 2] fw_cfg: Allow guest to read kernel etc via fast, synchronous "DMA"-type operation.
  2010-07-19 10:15 ` [Qemu-devel] [PATCH 0/2 version 2] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest Richard W.M. Jones
  2010-07-19 10:15   ` [Qemu-devel] [PATCH 1/2 version 2] Don't call fw_cfg e->callback if e->callback is NULL Richard W.M. Jones
@ 2010-07-19 10:16   ` Richard W.M. Jones
  2010-07-19 10:45   ` [Qemu-devel] Re: [PATCH 0/2 version 2] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest Gleb Natapov
  2010-07-19 11:30   ` [Qemu-devel] [PATCH 2/2 version 3] " Richard W.M. Jones
  3 siblings, 0 replies; 17+ messages in thread
From: Richard W.M. Jones @ 2010-07-19 10:16 UTC (permalink / raw)
  To: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 253 bytes --]


-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/

[-- Attachment #2: 0002-fw_cfg-Allow-guest-to-read-kernel-etc-via-fast-synch.patch --]
[-- Type: text/plain, Size: 5947 bytes --]

>From 55d4700262253da52aa403dc1ba68da2ae91b084 Mon Sep 17 00:00:00 2001
From: Richard Jones <rjones@redhat.com>
Date: Sat, 17 Jul 2010 14:30:46 +0100
Subject: [PATCH 2/2] fw_cfg: Allow guest to read kernel etc via fast, synchronous "DMA"-type operation.

This adds a "DMA" operation for rapidly copying the kernel, initrd
etc into the guest.

The guest sets up a DMA address and size and then issues the usual
read operation but with the FW_CFG_DMA bit set on the entry number.
QEmu then just copies the whole config entry to the selected physical
address synchronously.

This saves some time when loading large images.

This change is backwards compatible.  ROMs using the old method will
work unchanged.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
---
 hw/fw_cfg.c                   |   22 +++++++++++++++++++++-
 hw/fw_cfg.h                   |    8 ++++++--
 pc-bios/optionrom/linuxboot.S |    8 ++++----
 pc-bios/optionrom/optionrom.h |   37 +++++++++++++++++++++++++++++++++++++
 4 files changed, 68 insertions(+), 7 deletions(-)

diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
index 37e6f1f..798a332 100644
--- a/hw/fw_cfg.c
+++ b/hw/fw_cfg.c
@@ -55,6 +55,13 @@ struct FWCfgState {
     uint32_t cur_offset;
 };
 
+/* Target address and size for DMA operations.  This is only used
+ * during boot and across 32 and 64 bit architectures, so only writes
+ * to lower 4GB addresses are supported.
+ */
+static uint32_t dma_addr = 0;
+static uint32_t dma_size = 0;
+
 static void fw_cfg_write(FWCfgState *s, uint8_t value)
 {
     int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
@@ -98,7 +105,16 @@ static uint8_t fw_cfg_read(FWCfgState *s)
 
     if (s->cur_entry == FW_CFG_INVALID || !e->data || s->cur_offset >= e->len)
         ret = 0;
-    else
+    else if (s->cur_entry & FW_CFG_DMA) {
+        if (dma_size > e->len - s->cur_offset)
+            dma_size = e->len - s->cur_offset;
+
+        cpu_physical_memory_write ((target_phys_addr_t) dma_addr,
+                                   &e->data[s->cur_offset],
+                                   dma_size);
+        s->cur_offset += e->len;
+        ret = 1;
+    } else
         ret = e->data[s->cur_offset++];
 
     FW_CFG_DPRINTF("read %d\n", ret);
@@ -351,6 +367,10 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
     fw_cfg_add_i16(s, FW_CFG_NB_CPUS, (uint16_t)smp_cpus);
     fw_cfg_add_i16(s, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
     fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu);
+    fw_cfg_add_bytes(s, FW_CFG_DMA_ADDR | FW_CFG_WRITE_CHANNEL,
+                     (uint8_t *)&dma_addr, sizeof dma_addr);
+    fw_cfg_add_bytes(s, FW_CFG_DMA_SIZE | FW_CFG_WRITE_CHANNEL,
+                     (uint8_t *)&dma_size, sizeof dma_size);
 
     return s;
 }
diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
index 4d13a4f..44b2be5 100644
--- a/hw/fw_cfg.h
+++ b/hw/fw_cfg.h
@@ -30,11 +30,15 @@
 
 #define FW_CFG_FILE_FIRST       0x20
 #define FW_CFG_FILE_SLOTS       0x10
-#define FW_CFG_MAX_ENTRY        (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
 
+#define FW_CFG_DMA_ADDR         0x30
+#define FW_CFG_DMA_SIZE         0x31
+#define FW_CFG_MAX_ENTRY        (FW_CFG_DMA_SIZE+1)
+
+#define FW_CFG_DMA              0x2000
 #define FW_CFG_WRITE_CHANNEL    0x4000
 #define FW_CFG_ARCH_LOCAL       0x8000
-#define FW_CFG_ENTRY_MASK       ~(FW_CFG_WRITE_CHANNEL | FW_CFG_ARCH_LOCAL)
+#define FW_CFG_ENTRY_MASK       ~(FW_CFG_DMA | FW_CFG_WRITE_CHANNEL | FW_CFG_ARCH_LOCAL)
 
 #define FW_CFG_INVALID          0xffff
 
diff --git a/pc-bios/optionrom/linuxboot.S b/pc-bios/optionrom/linuxboot.S
index c109363..dbf44cb 100644
--- a/pc-bios/optionrom/linuxboot.S
+++ b/pc-bios/optionrom/linuxboot.S
@@ -106,10 +106,10 @@ copy_kernel:
 	/* We're now running in 16-bit CS, but 32-bit ES! */
 
 	/* Load kernel and initrd */
-	read_fw_blob_addr32(FW_CFG_KERNEL)
-	read_fw_blob_addr32(FW_CFG_INITRD)
-	read_fw_blob_addr32(FW_CFG_CMDLINE)
-	read_fw_blob_addr32(FW_CFG_SETUP)
+	read_fw_blob_dma(FW_CFG_KERNEL)
+	read_fw_blob_dma(FW_CFG_INITRD)
+	read_fw_blob_dma(FW_CFG_CMDLINE)
+	read_fw_blob_dma(FW_CFG_SETUP)
 
 	/* And now jump into Linux! */
 	mov		$0, %eax
diff --git a/pc-bios/optionrom/optionrom.h b/pc-bios/optionrom/optionrom.h
index fbdd48a..7fffe2d 100644
--- a/pc-bios/optionrom/optionrom.h
+++ b/pc-bios/optionrom/optionrom.h
@@ -50,6 +50,27 @@
 	bswap		%eax
 .endm
 
+/*
+ * Write %eax to a variable in the fw_cfg device.
+ * In:          %eax
+ * Clobbers:	%edx
+ */
+.macro write_fw VAR
+        push            %eax
+        mov             $(\VAR|FW_CFG_WRITE_CHANNEL), %ax
+        mov             $BIOS_CFG_IOPORT_CFG, %dx
+        outw            %ax, (%dx)
+        pop             %eax
+        mov             $BIOS_CFG_IOPORT_DATA, %dx
+        outb            %al, (%dx)
+        shr             $8, %eax
+        outb            %al, (%dx)
+        shr             $8, %eax
+        outb            %al, (%dx)
+        shr             $8, %eax
+        outb            %al, (%dx)
+.endm
+
 #define read_fw_blob_pre(var)				\
 	read_fw		var ## _ADDR;			\
 	mov		%eax, %edi;			\
@@ -87,6 +108,22 @@
 	*/						\
 	.dc.b		0x67,0xf3,0x6c
 
+/*
+ * Fast DMA of data from fw_cfg device into physical memory.
+ * This should be a straight replacement for read_fw_blob and
+ * read_fw_blob_addr32.
+ */
+#define read_fw_blob_dma(var)                           \
+        read_fw         var ## _ADDR;                   \
+        write_fw        FW_CFG_DMA_ADDR;                \
+        read_fw         var ## _SIZE;                   \
+        write_fw        FW_CFG_DMA_SIZE;                \
+        mov             $(var ## _DATA|FW_CFG_DMA), %ax; \
+        mov             $BIOS_CFG_IOPORT_CFG, %edx;     \
+        outw            %ax, (%dx);                     \
+        mov             $BIOS_CFG_IOPORT_DATA, %dx;     \
+        inb             (%dx), %al
+
 #define OPTION_ROM_START					\
     .code16;						\
     .text;						\
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] Re: [PATCH 0/2 version 2] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest
  2010-07-19 10:15 ` [Qemu-devel] [PATCH 0/2 version 2] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest Richard W.M. Jones
  2010-07-19 10:15   ` [Qemu-devel] [PATCH 1/2 version 2] Don't call fw_cfg e->callback if e->callback is NULL Richard W.M. Jones
  2010-07-19 10:16   ` [Qemu-devel] [PATCH 2/2 version 2] fw_cfg: Allow guest to read kernel etc via fast, synchronous "DMA"-type operation Richard W.M. Jones
@ 2010-07-19 10:45   ` Gleb Natapov
  2010-07-19 10:49     ` Richard W.M. Jones
  2010-07-19 11:30   ` [Qemu-devel] [PATCH 2/2 version 3] " Richard W.M. Jones
  3 siblings, 1 reply; 17+ messages in thread
From: Gleb Natapov @ 2010-07-19 10:45 UTC (permalink / raw)
  To: Richard W.M. Jones; +Cc: qemu-devel, agraf

On Mon, Jul 19, 2010 at 11:15:04AM +0100, Richard W.M. Jones wrote:
> 
> This is the second version of the patch.
> 
> We don't use the word "blit" any more, instead this is replaced with
> "DMA", even though it's not quite like a DMA operation on physical
> hardware.
> 
You ignored the whole discussion above. Calling things DMA will not make
them so. You haven't event implemented Alexander's suggestion to poll
for DMA completion which will at lease make the interface to the guest
palatable.
 
> The guest writes the physical address and size to two 32 bit fw_cfg
> variables.  Then when the guest issues an ordinary read operation with
> the extra FW_CFG_DMA flag set, instead of returning a single byte,
> qemu "DMA"s the requested data into the guest memory.
> 
> The guest shouldn't be able to request a dma_size larger than the
> amount of data in the entry.  The patch checks this and adjusts
> dma_size.
> 
> The guest might select a dma_addr which does not correspond to
> physical memory (or dma_addr + dma_size).  Reading the code it seems
> to be that cpu_physical_memory_write catches this case and will
> abort() (so the guest is only harming itself).  However I'd quite like
> an expert opinion on this ...
> 
> Rich.
> 
> -- 
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> virt-top is 'top' for virtual machines.  Tiny program with many
> powerful monitoring features, net stats, disk stats, logging, etc.
> http://et.redhat.com/~rjones/virt-top

--
			Gleb.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] Re: [PATCH 0/2 version 2] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest
  2010-07-19 10:45   ` [Qemu-devel] Re: [PATCH 0/2 version 2] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest Gleb Natapov
@ 2010-07-19 10:49     ` Richard W.M. Jones
  2010-07-19 10:56       ` Gleb Natapov
  0 siblings, 1 reply; 17+ messages in thread
From: Richard W.M. Jones @ 2010-07-19 10:49 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: qemu-devel, agraf

On Mon, Jul 19, 2010 at 01:45:00PM +0300, Gleb Natapov wrote:
> On Mon, Jul 19, 2010 at 11:15:04AM +0100, Richard W.M. Jones wrote:
> > 
> > This is the second version of the patch.
> > 
> > We don't use the word "blit" any more, instead this is replaced with
> > "DMA", even though it's not quite like a DMA operation on physical
> > hardware.
> > 
> You ignored the whole discussion above. Calling things DMA will not make
> them so. You haven't event implemented Alexander's suggestion to poll
> for DMA completion which will at lease make the interface to the guest
> palatable.

I read everything in the discussion.

I can add polling however.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
New in Fedora 11: Fedora Windows cross-compiler. Compile Windows
programs, test, and build Windows installers. Over 70 libraries supprt'd
http://fedoraproject.org/wiki/MinGW http://www.annexia.org/fedora_mingw

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] Re: [PATCH 0/2 version 2] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest
  2010-07-19 10:49     ` Richard W.M. Jones
@ 2010-07-19 10:56       ` Gleb Natapov
  0 siblings, 0 replies; 17+ messages in thread
From: Gleb Natapov @ 2010-07-19 10:56 UTC (permalink / raw)
  To: Richard W.M. Jones; +Cc: qemu-devel, agraf

On Mon, Jul 19, 2010 at 11:49:09AM +0100, Richard W.M. Jones wrote:
> On Mon, Jul 19, 2010 at 01:45:00PM +0300, Gleb Natapov wrote:
> > On Mon, Jul 19, 2010 at 11:15:04AM +0100, Richard W.M. Jones wrote:
> > > 
> > > This is the second version of the patch.
> > > 
> > > We don't use the word "blit" any more, instead this is replaced with
> > > "DMA", even though it's not quite like a DMA operation on physical
> > > hardware.
> > > 
> > You ignored the whole discussion above. Calling things DMA will not make
> > them so. You haven't event implemented Alexander's suggestion to poll
> > for DMA completion which will at lease make the interface to the guest
> > palatable.
> 
> I read everything in the discussion.
> 
> I can add polling however.
> 
If copying (call to cpu_physical_memory_write()) really takes 6 or more
seconds we should really make it async from the beginning. (If we are going
this way at all. I prefer to use virtio-serial for such complex gust/host
communication. fw_cfg was designed to be simple at should stay so. And
it is used by other arches too so any extension should be usable there).

--
			Gleb.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 2/2 version 3] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest
  2010-07-19 10:15 ` [Qemu-devel] [PATCH 0/2 version 2] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest Richard W.M. Jones
                     ` (2 preceding siblings ...)
  2010-07-19 10:45   ` [Qemu-devel] Re: [PATCH 0/2 version 2] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest Gleb Natapov
@ 2010-07-19 11:30   ` Richard W.M. Jones
  2010-07-20 14:41     ` [Qemu-devel] [PATCH 2/2 version 4] " Richard W.M. Jones
  3 siblings, 1 reply; 17+ messages in thread
From: Richard W.M. Jones @ 2010-07-19 11:30 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, Gleb Natapov

[-- Attachment #1: Type: text/plain, Size: 434 bytes --]


This version adds polling for DMA done.

Part 1/2 (the fix for e->callback == NULL) is the same as always so
I won't attach it.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://et.redhat.com/~rjones/libguestfs/
See what it can do: http://et.redhat.com/~rjones/libguestfs/recipes.html

[-- Attachment #2: 0002-fw_cfg-Allow-guest-to-read-kernel-etc-via-fast-synch.patch --]
[-- Type: text/plain, Size: 7030 bytes --]

>From 449eeef3dedb5612fe4f408835bbd926643d35f8 Mon Sep 17 00:00:00 2001
From: Richard Jones <rjones@redhat.com>
Date: Sat, 17 Jul 2010 14:30:46 +0100
Subject: [PATCH 2/2] fw_cfg: Allow guest to read kernel etc via fast, synchronous "DMA"-type operation.

This adds a "DMA" operation for rapidly copying the kernel, initrd
etc into the guest.

The guest sets up a DMA address and size and then issues the usual
read operation but with the FW_CFG_DMA bit set on the entry number.
QEmu then just copies the whole config entry to the selected physical
address synchronously.  The guest should poll until this "DMA"
operation is done, allowing us to write an alternate asynchronous
version in future should that be necessary.

This saves some time when loading large images.

This change is backwards compatible.  ROMs using the old method will
work unchanged.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
---
 hw/fw_cfg.c                   |   35 +++++++++++++++++++++++++++++++++-
 hw/fw_cfg.h                   |   10 +++++++-
 pc-bios/optionrom/linuxboot.S |    8 +++---
 pc-bios/optionrom/optionrom.h |   42 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 88 insertions(+), 7 deletions(-)

diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
index 37e6f1f..383586f 100644
--- a/hw/fw_cfg.c
+++ b/hw/fw_cfg.c
@@ -55,6 +55,13 @@ struct FWCfgState {
     uint32_t cur_offset;
 };
 
+/* Target address and size for DMA operations.  This is only used
+ * during boot and across 32 and 64 bit architectures, so only writes
+ * to lower 4GB addresses are supported.
+ */
+static uint32_t dma_addr = 0;
+static uint32_t dma_size = 0;
+
 static void fw_cfg_write(FWCfgState *s, uint8_t value)
 {
     int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
@@ -98,7 +105,22 @@ static uint8_t fw_cfg_read(FWCfgState *s)
 
     if (s->cur_entry == FW_CFG_INVALID || !e->data || s->cur_offset >= e->len)
         ret = 0;
-    else
+    else if (s->cur_entry & FW_CFG_DMA) {
+        if (dma_size > e->len - s->cur_offset)
+            dma_size = e->len - s->cur_offset;
+
+        cpu_physical_memory_write ((target_phys_addr_t) dma_addr,
+                                   &e->data[s->cur_offset],
+                                   dma_size);
+        s->cur_offset += e->len;
+        /* Returns 0 if there was an error, 1 if the DMA operation
+         * started.  Callers *must* poll for completion of the
+         * operation (waiting for FW_CFG_DMA_DONE == 0), even though
+         * in the current implementation the operation completes
+         * instantaneously from the p.o.v of the current guest vCPU.
+         */
+        ret = 1;
+    } else
         ret = e->data[s->cur_offset++];
 
     FW_CFG_DPRINTF("read %d\n", ret);
@@ -352,6 +374,17 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
     fw_cfg_add_i16(s, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
     fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu);
 
+    fw_cfg_add_bytes(s, FW_CFG_DMA_ADDR | FW_CFG_WRITE_CHANNEL,
+                     (uint8_t *)&dma_addr, sizeof dma_addr);
+    fw_cfg_add_bytes(s, FW_CFG_DMA_SIZE | FW_CFG_WRITE_CHANNEL,
+                     (uint8_t *)&dma_size, sizeof dma_size);
+    /* Current implementation is synchronous, so this value always reads
+     * as 0 (meaning "done").  In other possible implementations, this
+     * could return > 0 indicating that the caller should continue polling
+     * for completion of the operation.
+     */
+    fw_cfg_add_i32(s, FW_CFG_DMA_DONE, 0);
+
     return s;
 }
 
diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
index 4d13a4f..abc41c9 100644
--- a/hw/fw_cfg.h
+++ b/hw/fw_cfg.h
@@ -30,11 +30,17 @@
 
 #define FW_CFG_FILE_FIRST       0x20
 #define FW_CFG_FILE_SLOTS       0x10
-#define FW_CFG_MAX_ENTRY        (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
 
+#define FW_CFG_DMA_ADDR         0x30
+#define FW_CFG_DMA_SIZE         0x31
+#define FW_CFG_DMA_DONE         0x32
+
+#define FW_CFG_MAX_ENTRY        (FW_CFG_DMA_DONE+1)
+
+#define FW_CFG_DMA              0x2000
 #define FW_CFG_WRITE_CHANNEL    0x4000
 #define FW_CFG_ARCH_LOCAL       0x8000
-#define FW_CFG_ENTRY_MASK       ~(FW_CFG_WRITE_CHANNEL | FW_CFG_ARCH_LOCAL)
+#define FW_CFG_ENTRY_MASK       ~(FW_CFG_DMA | FW_CFG_WRITE_CHANNEL | FW_CFG_ARCH_LOCAL)
 
 #define FW_CFG_INVALID          0xffff
 
diff --git a/pc-bios/optionrom/linuxboot.S b/pc-bios/optionrom/linuxboot.S
index c109363..dbf44cb 100644
--- a/pc-bios/optionrom/linuxboot.S
+++ b/pc-bios/optionrom/linuxboot.S
@@ -106,10 +106,10 @@ copy_kernel:
 	/* We're now running in 16-bit CS, but 32-bit ES! */
 
 	/* Load kernel and initrd */
-	read_fw_blob_addr32(FW_CFG_KERNEL)
-	read_fw_blob_addr32(FW_CFG_INITRD)
-	read_fw_blob_addr32(FW_CFG_CMDLINE)
-	read_fw_blob_addr32(FW_CFG_SETUP)
+	read_fw_blob_dma(FW_CFG_KERNEL)
+	read_fw_blob_dma(FW_CFG_INITRD)
+	read_fw_blob_dma(FW_CFG_CMDLINE)
+	read_fw_blob_dma(FW_CFG_SETUP)
 
 	/* And now jump into Linux! */
 	mov		$0, %eax
diff --git a/pc-bios/optionrom/optionrom.h b/pc-bios/optionrom/optionrom.h
index fbdd48a..237f71e 100644
--- a/pc-bios/optionrom/optionrom.h
+++ b/pc-bios/optionrom/optionrom.h
@@ -50,6 +50,27 @@
 	bswap		%eax
 .endm
 
+/*
+ * Write %eax to a variable in the fw_cfg device.
+ * In:          %eax
+ * Clobbers:	%edx
+ */
+.macro write_fw VAR
+        push            %eax
+        mov             $(\VAR|FW_CFG_WRITE_CHANNEL), %ax
+        mov             $BIOS_CFG_IOPORT_CFG, %dx
+        outw            %ax, (%dx)
+        pop             %eax
+        mov             $BIOS_CFG_IOPORT_DATA, %dx
+        outb            %al, (%dx)
+        shr             $8, %eax
+        outb            %al, (%dx)
+        shr             $8, %eax
+        outb            %al, (%dx)
+        shr             $8, %eax
+        outb            %al, (%dx)
+.endm
+
 #define read_fw_blob_pre(var)				\
 	read_fw		var ## _ADDR;			\
 	mov		%eax, %edi;			\
@@ -87,6 +108,27 @@
 	*/						\
 	.dc.b		0x67,0xf3,0x6c
 
+/*
+ * Fast DMA of data from fw_cfg device into physical memory.
+ * This should be a straight replacement for read_fw_blob and
+ * read_fw_blob_addr32.
+ */
+#define read_fw_blob_dma(var)                           \
+        read_fw         var ## _ADDR;                   \
+        write_fw        FW_CFG_DMA_ADDR;                \
+        read_fw         var ## _SIZE;                   \
+        write_fw        FW_CFG_DMA_SIZE;                \
+        mov             $(var ## _DATA|FW_CFG_DMA), %ax; \
+        mov             $BIOS_CFG_IOPORT_CFG, %edx;     \
+        outw            %ax, (%dx);                     \
+        mov             $BIOS_CFG_IOPORT_DATA, %dx;     \
+        inb             (%dx), %al;                     \
+     1: test            %al, %al;                       \
+        jz              1f;                             \
+        read_fw         FW_CFG_DMA_DONE;                \
+        jmp 1b;                                         \
+     1:
+
 #define OPTION_ROM_START					\
     .code16;						\
     .text;						\
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 2/2 version 4] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest
  2010-07-19 11:30   ` [Qemu-devel] [PATCH 2/2 version 3] " Richard W.M. Jones
@ 2010-07-20 14:41     ` Richard W.M. Jones
  0 siblings, 0 replies; 17+ messages in thread
From: Richard W.M. Jones @ 2010-07-20 14:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf

[-- Attachment #1: Type: text/plain, Size: 559 bytes --]


This version adds a FW_CFG_FEATURES interface which the option ROM may
use to find out if qemu supports the "DMA"-like feature.  And the
linuxboot.bin option ROM has been changed to use this interface.

As previously, the 1/2 part of the patch is unchanged.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
New in Fedora 11: Fedora Windows cross-compiler. Compile Windows
programs, test, and build Windows installers. Over 70 libraries supprt'd
http://fedoraproject.org/wiki/MinGW http://www.annexia.org/fedora_mingw

[-- Attachment #2: 0002-fw_cfg-Allow-guest-to-read-kernel-etc-via-fast-synch.patch --]
[-- Type: text/plain, Size: 8086 bytes --]

>From 10749ad35708bbd705be489303cb1cc39224e25c Mon Sep 17 00:00:00 2001
From: Richard Jones <rjones@redhat.com>
Date: Sat, 17 Jul 2010 14:30:46 +0100
Subject: [PATCH 2/2] fw_cfg: Allow guest to read kernel etc via fast, synchronous "DMA"-type operation.

This adds a "DMA" operation for rapidly copying the kernel, initrd
etc into the guest.  This saves time when loading large images.

The guest sets up a DMA address and size and then issues the usual
read operation but with the FW_CFG_DMA bit set on the entry number.
QEmu then just copies the whole config entry to the selected physical
address synchronously.  The guest should poll until this "DMA"
operation is done, allowing us to write an alternate asynchronous
version in future should that be necessary.

This change is backwards compatible.  ROMs using the old method will
work unchanged.

The change is discoverable.  ROMs may read the FW_CFG_FEATURES bitmask
to find out if a version of qemu has this feature.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
---
 hw/fw_cfg.c                   |   37 ++++++++++++++++++++++++++++++++++-
 hw/fw_cfg.h                   |   20 ++++++++++++++++--
 pc-bios/optionrom/linuxboot.S |   11 +++++++++-
 pc-bios/optionrom/optionrom.h |   43 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 106 insertions(+), 5 deletions(-)

diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
index 37e6f1f..a881bd5 100644
--- a/hw/fw_cfg.c
+++ b/hw/fw_cfg.c
@@ -55,6 +55,13 @@ struct FWCfgState {
     uint32_t cur_offset;
 };
 
+/* Target address and size for DMA operations.  This is only used
+ * during boot and across 32 and 64 bit architectures, so only writes
+ * to lower 4GB addresses are supported.
+ */
+static uint32_t dma_addr = 0;
+static uint32_t dma_size = 0;
+
 static void fw_cfg_write(FWCfgState *s, uint8_t value)
 {
     int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
@@ -98,7 +105,22 @@ static uint8_t fw_cfg_read(FWCfgState *s)
 
     if (s->cur_entry == FW_CFG_INVALID || !e->data || s->cur_offset >= e->len)
         ret = 0;
-    else
+    else if (s->cur_entry & FW_CFG_DMA) {
+        if (dma_size > e->len - s->cur_offset)
+            dma_size = e->len - s->cur_offset;
+
+        cpu_physical_memory_write ((target_phys_addr_t) dma_addr,
+                                   &e->data[s->cur_offset],
+                                   dma_size);
+        s->cur_offset += e->len;
+        /* Returns 0 if there was an error, 1 if the DMA operation
+         * started.  Callers *must* poll for completion of the
+         * operation (waiting for FW_CFG_DMA_DONE == 0), even though
+         * in the current implementation the operation completes
+         * instantaneously from the p.o.v of the current guest vCPU.
+         */
+        ret = 1;
+    } else
         ret = e->data[s->cur_offset++];
 
     FW_CFG_DPRINTF("read %d\n", ret);
@@ -352,6 +374,19 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
     fw_cfg_add_i16(s, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
     fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu);
 
+    fw_cfg_add_i32(s, FW_CFG_FEATURES, FW_CFG_FEATURES_DMA);
+
+    fw_cfg_add_bytes(s, FW_CFG_DMA_ADDR | FW_CFG_WRITE_CHANNEL,
+                     (uint8_t *)&dma_addr, sizeof dma_addr);
+    fw_cfg_add_bytes(s, FW_CFG_DMA_SIZE | FW_CFG_WRITE_CHANNEL,
+                     (uint8_t *)&dma_size, sizeof dma_size);
+    /* Current implementation is synchronous, so this value always reads
+     * as 0 (meaning "done").  In other possible implementations, this
+     * could return > 0 indicating that the caller should continue polling
+     * for completion of the operation.
+     */
+    fw_cfg_add_i32(s, FW_CFG_DMA_DONE, 0);
+
     return s;
 }
 
diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
index 4d13a4f..31ee4ff 100644
--- a/hw/fw_cfg.h
+++ b/hw/fw_cfg.h
@@ -27,17 +27,31 @@
 #define FW_CFG_SETUP_SIZE       0x17
 #define FW_CFG_SETUP_DATA       0x18
 #define FW_CFG_FILE_DIR         0x19
-
 #define FW_CFG_FILE_FIRST       0x20
 #define FW_CFG_FILE_SLOTS       0x10
-#define FW_CFG_MAX_ENTRY        (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
+#define FW_CFG_FEATURES         0x30
+#define FW_CFG_DMA_ADDR         0x31
+#define FW_CFG_DMA_SIZE         0x32
+#define FW_CFG_DMA_DONE         0x33
+
+#define FW_CFG_MAX_ENTRY        (FW_CFG_DMA_DONE+1)
 
+#define FW_CFG_DMA              0x2000
 #define FW_CFG_WRITE_CHANNEL    0x4000
 #define FW_CFG_ARCH_LOCAL       0x8000
-#define FW_CFG_ENTRY_MASK       ~(FW_CFG_WRITE_CHANNEL | FW_CFG_ARCH_LOCAL)
+#define FW_CFG_ENTRY_MASK       ~(FW_CFG_DMA | FW_CFG_WRITE_CHANNEL | FW_CFG_ARCH_LOCAL)
 
 #define FW_CFG_INVALID          0xffff
 
+/* Bitmask of features returned by reading FW_CFG_FEATURES entry.
+ * Older versions of qemu didn't implement this, but they returned 0
+ * when you read an invalid entry.  So 0 indicates that only the basic
+ * features < entry 0x30 are supported (in reality it's not well
+ * defined because features were incrementally added over time with no
+ * feature detection interface).
+ */
+#define FW_CFG_FEATURES_DMA     0x00000001
+
 #ifndef NO_QEMU_PROTOS
 typedef struct FWCfgFile {
     uint32_t  size;        /* file size */
diff --git a/pc-bios/optionrom/linuxboot.S b/pc-bios/optionrom/linuxboot.S
index c109363..d86973a 100644
--- a/pc-bios/optionrom/linuxboot.S
+++ b/pc-bios/optionrom/linuxboot.S
@@ -106,11 +106,20 @@ copy_kernel:
 	/* We're now running in 16-bit CS, but 32-bit ES! */
 
 	/* Load kernel and initrd */
-	read_fw_blob_addr32(FW_CFG_KERNEL)
+	read_fw         FW_CFG_FEATURES
+	andl            $FW_CFG_FEATURES_DMA, %eax
+	jz no_dma
+	read_fw_blob_dma(FW_CFG_KERNEL)
+	read_fw_blob_dma(FW_CFG_INITRD)
+	read_fw_blob_dma(FW_CFG_CMDLINE)
+	read_fw_blob_dma(FW_CFG_SETUP)
+	jz jmp_linux
+no_dma:	read_fw_blob_addr32(FW_CFG_KERNEL)
 	read_fw_blob_addr32(FW_CFG_INITRD)
 	read_fw_blob_addr32(FW_CFG_CMDLINE)
 	read_fw_blob_addr32(FW_CFG_SETUP)
 
+jmp_linux:
 	/* And now jump into Linux! */
 	mov		$0, %eax
 	mov		%eax, %cr0
diff --git a/pc-bios/optionrom/optionrom.h b/pc-bios/optionrom/optionrom.h
index fbdd48a..b8c2a76 100644
--- a/pc-bios/optionrom/optionrom.h
+++ b/pc-bios/optionrom/optionrom.h
@@ -50,6 +50,27 @@
 	bswap		%eax
 .endm
 
+/*
+ * Write %eax to a variable in the fw_cfg device.
+ * In:          %eax
+ * Clobbers:	%edx
+ */
+.macro write_fw VAR
+        push            %eax
+        mov             $(\VAR|FW_CFG_WRITE_CHANNEL), %ax
+        mov             $BIOS_CFG_IOPORT_CFG, %dx
+        outw            %ax, (%dx)
+        pop             %eax
+        mov             $BIOS_CFG_IOPORT_DATA, %dx
+        outb            %al, (%dx)
+        shr             $8, %eax
+        outb            %al, (%dx)
+        shr             $8, %eax
+        outb            %al, (%dx)
+        shr             $8, %eax
+        outb            %al, (%dx)
+.endm
+
 #define read_fw_blob_pre(var)				\
 	read_fw		var ## _ADDR;			\
 	mov		%eax, %edi;			\
@@ -87,6 +108,28 @@
 	*/						\
 	.dc.b		0x67,0xf3,0x6c
 
+/*
+ * Fast DMA of data from fw_cfg device into physical memory.
+ *
+ * Check FW_CFG_FEATURES contains FW_CFG_FEATURES_DMA before
+ * using this macro.
+ */
+#define read_fw_blob_dma(var)                           \
+        read_fw         var ## _ADDR;                   \
+        write_fw        FW_CFG_DMA_ADDR;                \
+        read_fw         var ## _SIZE;                   \
+        write_fw        FW_CFG_DMA_SIZE;                \
+        mov             $(var ## _DATA|FW_CFG_DMA), %ax; \
+        mov             $BIOS_CFG_IOPORT_CFG, %edx;     \
+        outw            %ax, (%dx);                     \
+        mov             $BIOS_CFG_IOPORT_DATA, %dx;     \
+        inb             (%dx), %al;                     \
+     1: test            %al, %al;                       \
+        jz              1f;                             \
+        read_fw         FW_CFG_DMA_DONE;                \
+        jmp 1b;                                         \
+     1:
+
 #define OPTION_ROM_START					\
     .code16;						\
     .text;						\
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2010-07-20 14:42 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-07-17 13:39 [Qemu-devel] [PATCH 0/2] fw_cfg: Implement fast "blit" operation for rapidly copying in kernel, initrd [etc] into the guest Richard W.M. Jones
2010-07-17 13:40 ` [Qemu-devel] [PATCH 1/2] Don't call fw_cfg e->callback if e->callback is NULL Richard W.M. Jones
2010-07-17 13:41 ` [Qemu-devel] [PATCH 2/2] fw_cfg: Add blit operation for copying kernel, initrd, Richard W.M. Jones
2010-07-18 20:47   ` Alexander Graf
2010-07-18 21:12     ` Richard W.M. Jones
2010-07-18 21:13       ` Alexander Graf
2010-07-18 23:59   ` Aurelien Jarno
2010-07-19  7:23     ` Richard W.M. Jones
2010-07-19  9:19       ` Aurelien Jarno
2010-07-19 10:15 ` [Qemu-devel] [PATCH 0/2 version 2] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest Richard W.M. Jones
2010-07-19 10:15   ` [Qemu-devel] [PATCH 1/2 version 2] Don't call fw_cfg e->callback if e->callback is NULL Richard W.M. Jones
2010-07-19 10:16   ` [Qemu-devel] [PATCH 2/2 version 2] fw_cfg: Allow guest to read kernel etc via fast, synchronous "DMA"-type operation Richard W.M. Jones
2010-07-19 10:45   ` [Qemu-devel] Re: [PATCH 0/2 version 2] fw_cfg: Implement fast "DMA"-type operation for rapidly copying in kernel, initrd [etc] into the guest Gleb Natapov
2010-07-19 10:49     ` Richard W.M. Jones
2010-07-19 10:56       ` Gleb Natapov
2010-07-19 11:30   ` [Qemu-devel] [PATCH 2/2 version 3] " Richard W.M. Jones
2010-07-20 14:41     ` [Qemu-devel] [PATCH 2/2 version 4] " Richard W.M. Jones

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).