[Qemu-devel] [0/6] Assorted bugfixes

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] [0/6] Assorted bugfixes
@ 2012-02-24  0:23 David Gibson
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 1/6] kvm: Comparison with ioctl number macros needs to be unsigned David Gibson
                   ` (5 more replies)
  0 siblings, 6 replies; 36+ messages in thread
From: David Gibson @ 2012-02-24  0:23 UTC (permalink / raw)
  To: anthony; +Cc: qemu-devel

Hi Anthony,

This series contains my current accumulated batch of qemu bugfixes
which are not ppc specific.  There are small, but real, fixes from
across the codebase.  Maintainers for the individual subsystems should
be CCed.

Please apply.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH 1/6] kvm: Comparison with ioctl number macros needs to be unsigned
  2012-02-24  0:23 [Qemu-devel] [0/6] Assorted bugfixes David Gibson
@ 2012-02-24  0:23 ` David Gibson
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 2/6] slirp: Fix assertion failure on rejected DHCP requests David Gibson
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 36+ messages in thread
From: David Gibson @ 2012-02-24  0:23 UTC (permalink / raw)
  To: anthony; +Cc: Avi Kivity, Marcelo Tossatti, qemu-devel, David Gibson

In kvm-all.c we store an ioctl cmd number in the irqchip_inject_ioctl field
of KVMState, which has type 'int'.  This seems to make sense since the
ioctl() man page says that the cmd parameter has type int.

However, the kernel treats ioctl numbers as unsigned - sys_ioctl() takes an
unsigned int, and the macros which generate ioctl numbers expand to
unsigned expressions.  Furthermore, some ioctls (IOC_READ ioctls on x86
and IOC_WRITE ioctls on powerpc) have bit 31 set, and so would be negative
if interpreted as an int. This has the surprising and compile-breaking
consequence that in kvm_irqchip_set_irq() where we do:
    return (s->irqchip_inject_ioctl == KVM_IRQ_LINE) ? 1 : event.status;
We will get a "comparison is always false due to limited range of data
type" warning from gcc if KVM_IRQ_LINE is one of the bit-31-set ioctls,
which it is on powerpc.

So, despite the fact that the man page and posix say ioctl numbers are
signed, they're actually unsigned.  The kernel uses unsigned, the glibc
header uses unsigned long, and FreeBSD, NetBSD and OSX also use unsigned
long ioctl numbers in the code.

Therefore, this patch changes the variable to be unsigned, fixing the
compile.

Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tossatti <mtossatti@redhat.com>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 kvm-all.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index c4babda..5e188bf 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -77,7 +77,10 @@ struct KVMState
     int pit_in_kernel;
     int xsave, xcrs;
     int many_ioeventfds;
-    int irqchip_inject_ioctl;
+    /* The man page (and posix) say ioctl numbers are signed int, but
+     * they're not.  Linux, glibc and *BSD all treat ioctl numbers as
+     * unsigned, and treating them as signed here can break things */
+    unsigned irqchip_inject_ioctl;
 #ifdef KVM_CAP_IRQ_ROUTING
     struct kvm_irq_routing *irq_routes;
     int nr_allocated_irq_routes;
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH 2/6] slirp: Fix assertion failure on rejected DHCP requests
  2012-02-24  0:23 [Qemu-devel] [0/6] Assorted bugfixes David Gibson
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 1/6] kvm: Comparison with ioctl number macros needs to be unsigned David Gibson
@ 2012-02-24  0:23 ` David Gibson
  2012-02-27 13:58   ` Jan Kiszka
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 3/6] USB OHCI bug fixes David Gibson
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 36+ messages in thread
From: David Gibson @ 2012-02-24  0:23 UTC (permalink / raw)
  To: anthony; +Cc: Jan Kiszka, qemu-devel, David Gibson

The guest network stack might DHCPREQUEST an address that the slirp built
in dhcp server can't let it have - for example if the guest has an old
leases file from another network configuration.  In this case the dhcp
server should and does reject the request and prepares to send a DHCPNAK
to the client.

However, in this case the daddr variable in bootp_reply() is set to
0.0.0.0.  Shortly afterwards, it unconditionally attempts to pre-insert the
new client address into the ARP table.  This causes an assertion failure in
arp_address_add() because of the 0.0.0.0 address.

According to RFC2131, DHCPNAK messages for clients on the same subnet
must be sent to the broadcast address (S3.2, subpoint 2).

Cc: Jan Kiszka <jan.kiszka@siemens.com>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 slirp/bootp.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/slirp/bootp.c b/slirp/bootp.c
index efd1fe7..64eac7d 100644
--- a/slirp/bootp.c
+++ b/slirp/bootp.c
@@ -200,7 +200,8 @@ static void bootp_reply(Slirp *slirp, const struct bootp_t *bp)
             daddr.sin_addr = preq_addr;
             memcpy(bc->macaddr, client_ethaddr, ETH_ALEN);
         } else {
-            daddr.sin_addr.s_addr = 0;
+            /* DHCPNAKs should be sent to broadcast */
+            daddr.sin_addr.s_addr = 0xffffffff;
         }
     } else {
         bc = find_addr(slirp, &daddr.sin_addr, bp->bp_hwaddr);
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 2/6] slirp: Fix assertion failure on rejected DHCP requests
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 2/6] slirp: Fix assertion failure on rejected DHCP requests David Gibson
@ 2012-02-27 13:58   ` Jan Kiszka
  2012-02-28  1:07     ` David Gibson
  0 siblings, 1 reply; 36+ messages in thread
From: Jan Kiszka @ 2012-02-27 13:58 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-devel@nongnu.org, anthony@codemonkey.ws

On 2012-02-24 01:23, David Gibson wrote:
> The guest network stack might DHCPREQUEST an address that the slirp built
> in dhcp server can't let it have - for example if the guest has an old
> leases file from another network configuration.  In this case the dhcp
> server should and does reject the request and prepares to send a DHCPNAK
> to the client.
> 
> However, in this case the daddr variable in bootp_reply() is set to
> 0.0.0.0.  Shortly afterwards, it unconditionally attempts to pre-insert the
> new client address into the ARP table.  This causes an assertion failure in
> arp_address_add() because of the 0.0.0.0 address.
> 
> According to RFC2131, DHCPNAK messages for clients on the same subnet
> must be sent to the broadcast address (S3.2, subpoint 2).
> 
> Cc: Jan Kiszka <jan.kiszka@siemens.com>
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Thanks, applied to the slirp queue.

Jan

> ---
>  slirp/bootp.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/slirp/bootp.c b/slirp/bootp.c
> index efd1fe7..64eac7d 100644
> --- a/slirp/bootp.c
> +++ b/slirp/bootp.c
> @@ -200,7 +200,8 @@ static void bootp_reply(Slirp *slirp, const struct bootp_t *bp)
>              daddr.sin_addr = preq_addr;
>              memcpy(bc->macaddr, client_ethaddr, ETH_ALEN);
>          } else {
> -            daddr.sin_addr.s_addr = 0;
> +            /* DHCPNAKs should be sent to broadcast */
> +            daddr.sin_addr.s_addr = 0xffffffff;
>          }
>      } else {
>          bc = find_addr(slirp, &daddr.sin_addr, bp->bp_hwaddr);

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 2/6] slirp: Fix assertion failure on rejected DHCP requests
  2012-02-27 13:58   ` Jan Kiszka
@ 2012-02-28  1:07     ` David Gibson
  0 siblings, 0 replies; 36+ messages in thread
From: David Gibson @ 2012-02-28  1:07 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: qemu-devel@nongnu.org, anthony@codemonkey.ws

On Mon, Feb 27, 2012 at 02:58:07PM +0100, Jan Kiszka wrote:
> On 2012-02-24 01:23, David Gibson wrote:
> > The guest network stack might DHCPREQUEST an address that the slirp built
> > in dhcp server can't let it have - for example if the guest has an old
> > leases file from another network configuration.  In this case the dhcp
> > server should and does reject the request and prepares to send a DHCPNAK
> > to the client.
> > 
> > However, in this case the daddr variable in bootp_reply() is set to
> > 0.0.0.0.  Shortly afterwards, it unconditionally attempts to pre-insert the
> > new client address into the ARP table.  This causes an assertion failure in
> > arp_address_add() because of the 0.0.0.0 address.
> > 
> > According to RFC2131, DHCPNAK messages for clients on the same subnet
> > must be sent to the broadcast address (S3.2, subpoint 2).
> > 
> > Cc: Jan Kiszka <jan.kiszka@siemens.com>
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> 
> Thanks, applied to the slirp queue.

Thanks.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH 3/6] USB OHCI bug fixes
  2012-02-24  0:23 [Qemu-devel] [0/6] Assorted bugfixes David Gibson
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 1/6] kvm: Comparison with ioctl number macros needs to be unsigned David Gibson
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 2/6] slirp: Fix assertion failure on rejected DHCP requests David Gibson
@ 2012-02-24  0:23 ` David Gibson
  2012-02-27 14:13   ` Gerd Hoffmann
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 4/6] Endian fixes for virtfs David Gibson
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 36+ messages in thread
From: David Gibson @ 2012-02-24  0:23 UTC (permalink / raw)
  To: anthony; +Cc: Wei Yang, David Gibson, qemu-devel, Gerd Hoffman

From: Wei Yang <weiyang@linux.vnet.ibm.com>

This patch fixes two bugs in the OHCI device where the device writes
back data to system memory that should be exclusively under the
control of the guest side driver.

In OHCI specification Section 5.2.7, it mentioned "In all cases, Host
Controller Driver is responsible for the insertion and removal of all
Endpoint Descriptors in the various Host Controller Endpoint
Descriptor lists".  In the ohci_frame_boundary(), ohci_put_hcca()
writes the entire hcca back including the interrupt ED lists which
should be under driver control. This violates the specification and
can race with a host driver updating that list at the same time.

In the OHCI Spec Section 4.6, Transfer Descriptor Queue Processing, it
mentioned "Since the TD pointed to by TailP is not accessed by the HC,
the Host Controller Driver can initialize that TD and link at least
one other to it without creating a coherency or synchronization
problem".  While the function ohci_put_ed() writes the entire endpoint
descriptor back including the TailP which should under driver
control. This violate the specification and can race with a host
driver updating the TD list at the same time.

In each case the solution is to make sure we don't write data which is
under driver control.

Cc: Gerd Hoffman <kraxel@redhat.com>

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/usb-ohci.c |   17 +++++++++++++++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/hw/usb-ohci.c b/hw/usb-ohci.c
index 7aa19fe..5f88bc3 100644
--- a/hw/usb-ohci.c
+++ b/hw/usb-ohci.c
@@ -122,6 +122,8 @@ struct ohci_hcca {
     uint16_t frame, pad;
     uint32_t done;
 };
+#define HCCA_WRITEBACK_OFFSET   offsetof(struct ohci_hcca, frame)
+#define HCCA_WRITEBACK_SIZE     8 /* frame, pad, done */
 
 static void ohci_bus_stop(OHCIState *ohci);
 static void ohci_async_cancel_device(OHCIState *ohci, USBDevice *dev);
@@ -189,6 +191,10 @@ struct ohci_ed {
     uint32_t head;
     uint32_t next;
 };
+#define ED_TAILP_OFFSET   offsetof(struct ohci_ed, tail)
+#define ED_PART1_SIZE     ED_TAILP_OFFSET
+#define ED_HEADP_OFFSET   offsetof(struct ohci_ed, head)
+#define ED_PART2_SIZE     (sizeof(struct ohci_ed) - ED_HEADP_OFFSET)
 
 /* General transfer descriptor */
 struct ohci_td {
@@ -569,7 +575,12 @@ static inline int ohci_read_hcca(OHCIState *ohci,
 static inline int ohci_put_ed(OHCIState *ohci,
                               uint32_t addr, struct ohci_ed *ed)
 {
-    return put_dwords(ohci, addr, (uint32_t *)ed, sizeof(*ed) >> 2);
+    /* ed->tail is under control of the HCD, so we need to split
+     * the write back into two parts
+     */
+    put_dwords(ohci, addr, (uint32_t *)ed, ED_PART1_SIZE >> 2);
+    return put_dwords(ohci, addr + ED_HEADP_OFFSET,
+                      (uint32_t *)((char *)ed + ED_HEADP_OFFSET), ED_PART2_SIZE >> 2);
 }
 
 static inline int ohci_put_td(OHCIState *ohci,
@@ -588,7 +599,9 @@ static inline int ohci_put_iso_td(OHCIState *ohci,
 static inline int ohci_put_hcca(OHCIState *ohci,
                                 uint32_t addr, struct ohci_hcca *hcca)
 {
-    cpu_physical_memory_write(addr + ohci->localmem_base, hcca, sizeof(*hcca));
+    cpu_physical_memory_write(addr + ohci->localmem_base + HCCA_WRITEBACK_OFFSET,
+                              (char *)hcca + HCCA_WRITEBACK_OFFSET,
+                              HCCA_WRITEBACK_SIZE);
     return 1;
 }
 
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 3/6] USB OHCI bug fixes
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 3/6] USB OHCI bug fixes David Gibson
@ 2012-02-27 14:13   ` Gerd Hoffmann
  2012-02-28  3:09     ` David Gibson
  0 siblings, 1 reply; 36+ messages in thread
From: Gerd Hoffmann @ 2012-02-27 14:13 UTC (permalink / raw)
  To: David Gibson; +Cc: Wei Yang, qemu-devel, anthony

On 02/24/12 01:23, David Gibson wrote:
> From: Wei Yang <weiyang@linux.vnet.ibm.com>
> 
> This patch fixes two bugs in the OHCI device where the device writes
> back data to system memory that should be exclusively under the
> control of the guest side driver.

Looks good.  Fails checkpatch though.

What is your merge plan btw?  Wanna pick me this up for the usb queue?
Or do you need just my ack?

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 3/6] USB OHCI bug fixes
  2012-02-27 14:13   ` Gerd Hoffmann
@ 2012-02-28  3:09     ` David Gibson
  2012-02-28  3:37       ` David Gibson
  0 siblings, 1 reply; 36+ messages in thread
From: David Gibson @ 2012-02-28  3:09 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Wei Yang, qemu-devel, anthony

On Mon, Feb 27, 2012 at 03:13:02PM +0100, Gerd Hoffmann wrote:
> On 02/24/12 01:23, David Gibson wrote:
> > From: Wei Yang <weiyang@linux.vnet.ibm.com>
> > 
> > This patch fixes two bugs in the OHCI device where the device writes
> > back data to system memory that should be exclusively under the
> > control of the guest side driver.
> 
> Looks good.  Fails checkpatch though.

Oops, missed one of the overlong lines.  checkpatch clean revised
version below.

> What is your merge plan btw?  Wanna pick me this up for the usb queue?
> Or do you need just my ack?

If you could put it in your queue that would be great.

>From a4c996ba5e71dfd9f84abcf9365693229f707213 Mon Sep 17 00:00:00 2001
From: Wei Yang <weiyang@linux.vnet.ibm.com>
Date: Wed, 22 Feb 2012 12:02:06 +1100
Subject: [PATCH] USB OHCI bug fixes

This patch fixes two bugs in the OHCI device where the device writes
back data to system memory that should be exclusively under the
control of the guest side driver.

In OHCI specification Section 5.2.7, it mentioned "In all cases, Host
Controller Driver is responsible for the insertion and removal of all
Endpoint Descriptors in the various Host Controller Endpoint
Descriptor lists".  In the ohci_frame_boundary(), ohci_put_hcca()
writes the entire hcca back including the interrupt ED lists which
should be under driver control. This violates the specification and
can race with a host driver updating that list at the same time.

In the OHCI Spec Section 4.6, Transfer Descriptor Queue Processing, it
mentioned "Since the TD pointed to by TailP is not accessed by the HC,
the Host Controller Driver can initialize that TD and link at least
one other to it without creating a coherency or synchronization
problem".  While the function ohci_put_ed() writes the entire endpoint
descriptor back including the TailP which should under driver
control. This violate the specification and can race with a host
driver updating the TD list at the same time.

In each case the solution is to make sure we don't write data which is
under driver control.

Cc: Gerd Hoffman <kraxel@redhat.com>

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/usb-ohci.c |   18 ++++++++++++++++--
 1 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/hw/usb-ohci.c b/hw/usb-ohci.c
index 7aa19fe..a4b4ef9 100644
--- a/hw/usb-ohci.c
+++ b/hw/usb-ohci.c
@@ -122,6 +122,8 @@ struct ohci_hcca {
     uint16_t frame, pad;
     uint32_t done;
 };
+#define HCCA_WRITEBACK_OFFSET   offsetof(struct ohci_hcca, frame)
+#define HCCA_WRITEBACK_SIZE     8 /* frame, pad, done */
 
 static void ohci_bus_stop(OHCIState *ohci);
 static void ohci_async_cancel_device(OHCIState *ohci, USBDevice *dev);
@@ -189,6 +191,10 @@ struct ohci_ed {
     uint32_t head;
     uint32_t next;
 };
+#define ED_TAILP_OFFSET   offsetof(struct ohci_ed, tail)
+#define ED_PART1_SIZE     ED_TAILP_OFFSET
+#define ED_HEADP_OFFSET   offsetof(struct ohci_ed, head)
+#define ED_PART2_SIZE     (sizeof(struct ohci_ed) - ED_HEADP_OFFSET)
 
 /* General transfer descriptor */
 struct ohci_td {
@@ -569,7 +575,13 @@ static inline int ohci_read_hcca(OHCIState *ohci,
 static inline int ohci_put_ed(OHCIState *ohci,
                               uint32_t addr, struct ohci_ed *ed)
 {
-    return put_dwords(ohci, addr, (uint32_t *)ed, sizeof(*ed) >> 2);
+    /* ed->tail is under control of the HCD, so we need to split
+     * the write back into two parts
+     */
+    put_dwords(ohci, addr, (uint32_t *)ed, ED_PART1_SIZE >> 2);
+    return put_dwords(ohci, addr + ED_HEADP_OFFSET,
+                      (uint32_t *)((char *)ed + ED_HEADP_OFFSET),
+                      ED_PART2_SIZE >> 2);
 }
 
 static inline int ohci_put_td(OHCIState *ohci,
@@ -588,7 +600,9 @@ static inline int ohci_put_iso_td(OHCIState *ohci,
 static inline int ohci_put_hcca(OHCIState *ohci,
                                 uint32_t addr, struct ohci_hcca *hcca)
 {
-    cpu_physical_memory_write(addr + ohci->localmem_base, hcca, sizeof(*hcca));
+    cpu_physical_memory_write(addr + ohci->localmem_base + HCCA_WRITEBACK_OFFSET,
+                              (char *)hcca + HCCA_WRITEBACK_OFFSET,
+                              HCCA_WRITEBACK_SIZE);
     return 1;
 }
 
-- 
1.7.9



-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 3/6] USB OHCI bug fixes
  2012-02-28  3:09     ` David Gibson
@ 2012-02-28  3:37       ` David Gibson
  0 siblings, 0 replies; 36+ messages in thread
From: David Gibson @ 2012-02-28  3:37 UTC (permalink / raw)
  To: Gerd Hoffmann, anthony, qemu-devel, Wei Yang

On Tue, Feb 28, 2012 at 02:09:17PM +1100, David Gibson wrote:
> On Mon, Feb 27, 2012 at 03:13:02PM +0100, Gerd Hoffmann wrote:
> > On 02/24/12 01:23, David Gibson wrote:
> > > From: Wei Yang <weiyang@linux.vnet.ibm.com>
> > > 
> > > This patch fixes two bugs in the OHCI device where the device writes
> > > back data to system memory that should be exclusively under the
> > > control of the guest side driver.
> > 
> > Looks good.  Fails checkpatch though.
> 
> Oops, missed one of the overlong lines.  checkpatch clean revised
> version below.
> 
> > What is your merge plan btw?  Wanna pick me this up for the usb queue?
> > Or do you need just my ack?
> 
> If you could put it in your queue that would be great.

Ugh, but don't bother with this version.  I just discovered I didn't
have the latest version from Wei Yang, I'll update and resend.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH 4/6] Endian fixes for virtfs
  2012-02-24  0:23 [Qemu-devel] [0/6] Assorted bugfixes David Gibson
                   ` (2 preceding siblings ...)
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 3/6] USB OHCI bug fixes David Gibson
@ 2012-02-24  0:23 ` David Gibson
  2012-02-24  8:29   ` Aneesh Kumar K.V
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 5/6] Endian fix an assertion in usb-msd David Gibson
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size David Gibson
  5 siblings, 1 reply; 36+ messages in thread
From: David Gibson @ 2012-02-24  0:23 UTC (permalink / raw)
  To: anthony; +Cc: David Gibson, qemu-devel, Aneesh Kumar K.V

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>

This patch fixes several endian bugs in virtfs.

Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/9pfs/virtio-9p.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index a72ffc3..c633fb9 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -1349,7 +1349,9 @@ static void v9fs_open(void *opaque)
     if (s->proto_version == V9FS_PROTO_2000L) {
         err = pdu_unmarshal(pdu, offset, "dd", &fid, &mode);
     } else {
-        err = pdu_unmarshal(pdu, offset, "db", &fid, &mode);
+        uint8_t modebyte;
+        err = pdu_unmarshal(pdu, offset, "db", &fid, &modebyte);
+        mode = modebyte;
     }
     if (err < 0) {
         goto out_nofid;
@@ -3260,9 +3262,9 @@ void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq)
 
         ptr = pdu->elem.out_sg[0].iov_base;
 
-        memcpy(&pdu->size, ptr, 4);
+        pdu->size = le32_to_cpu(*(uint32_t *)ptr);
         pdu->id = ptr[4];
-        memcpy(&pdu->tag, ptr + 5, 2);
+        pdu->tag = le16_to_cpu(*(uint16_t *)(ptr + 5));
         qemu_co_queue_init(&pdu->complete);
         submit_pdu(s, pdu);
     }
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 4/6] Endian fixes for virtfs
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 4/6] Endian fixes for virtfs David Gibson
@ 2012-02-24  8:29   ` Aneesh Kumar K.V
  0 siblings, 0 replies; 36+ messages in thread
From: Aneesh Kumar K.V @ 2012-02-24  8:29 UTC (permalink / raw)
  To: David Gibson, anthony; +Cc: qemu-devel

On Fri, 24 Feb 2012 11:23:30 +1100, David Gibson <david@gibson.dropbear.id.au> wrote:
> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> 
> This patch fixes several endian bugs in virtfs.
> 
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> 
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/9pfs/virtio-9p.c |    8 +++++---
>  1 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
> index a72ffc3..c633fb9 100644
> --- a/hw/9pfs/virtio-9p.c
> +++ b/hw/9pfs/virtio-9p.c
> @@ -1349,7 +1349,9 @@ static void v9fs_open(void *opaque)
>      if (s->proto_version == V9FS_PROTO_2000L) {
>          err = pdu_unmarshal(pdu, offset, "dd", &fid, &mode);
>      } else {
> -        err = pdu_unmarshal(pdu, offset, "db", &fid, &mode);
> +        uint8_t modebyte;
> +        err = pdu_unmarshal(pdu, offset, "db", &fid, &modebyte);
> +        mode = modebyte;
>      }
>      if (err < 0) {
>          goto out_nofid;
> @@ -3260,9 +3262,9 @@ void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq)
> 
>          ptr = pdu->elem.out_sg[0].iov_base;
> 
> -        memcpy(&pdu->size, ptr, 4);
> +        pdu->size = le32_to_cpu(*(uint32_t *)ptr);
>          pdu->id = ptr[4];
> -        memcpy(&pdu->tag, ptr + 5, 2);
> +        pdu->tag = le16_to_cpu(*(uint16_t *)(ptr + 5));
>          qemu_co_queue_init(&pdu->complete);
>          submit_pdu(s, pdu);
>      }

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

-aneesh

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH 5/6] Endian fix an assertion in usb-msd
  2012-02-24  0:23 [Qemu-devel] [0/6] Assorted bugfixes David Gibson
                   ` (3 preceding siblings ...)
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 4/6] Endian fixes for virtfs David Gibson
@ 2012-02-24  0:23 ` David Gibson
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size David Gibson
  5 siblings, 0 replies; 36+ messages in thread
From: David Gibson @ 2012-02-24  0:23 UTC (permalink / raw)
  To: anthony; +Cc: David Gibson, qemu-devel, Gerd Hoffman

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>

This fixes a broken endian assumption in an assertion in usb-msd.

Cc: Gerd Hoffman <kraxel@redhat.com>

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/usb-msd.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/usb-msd.c b/hw/usb-msd.c
index c933efe..2e0da27 100644
--- a/hw/usb-msd.c
+++ b/hw/usb-msd.c
@@ -195,7 +195,7 @@ static void usb_msd_send_status(MSDState *s, USBPacket *p)
     DPRINTF("Command status %d tag 0x%x, len %zd\n",
             s->csw.status, s->csw.tag, p->iov.size);
 
-    assert(s->csw.sig == 0x53425355);
+    assert(s->csw.sig == cpu_to_le32(0x53425355));
     len = MIN(sizeof(s->csw), p->iov.size);
     usb_packet_copy(p, &s->csw, len);
     memset(&s->csw, 0, sizeof(s->csw));
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-02-24  0:23 [Qemu-devel] [0/6] Assorted bugfixes David Gibson
                   ` (4 preceding siblings ...)
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 5/6] Endian fix an assertion in usb-msd David Gibson
@ 2012-02-24  0:23 ` David Gibson
  2012-02-24  1:03   ` Benjamin Herrenschmidt
                     ` (3 more replies)
  5 siblings, 4 replies; 36+ messages in thread
From: David Gibson @ 2012-02-24  0:23 UTC (permalink / raw)
  To: anthony; +Cc: Marcelo Tossatti, qemu-devel, David Gibson, Avi Kivity

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>

If the kernel page size is larger than TARGET_PAGE_SIZE, which
happens for example on ppc64 with kernels compiled for 64K pages,
the dirty tracking doesn't work.

Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tossatti <mtossatti@redhat.com>

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 kvm-all.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 5e188bf..3f8cfd9 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -348,10 +348,11 @@ static int kvm_set_migration_log(int enable)
 static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
                                          unsigned long *bitmap)
 {
-    unsigned int i, j;
+  unsigned int i, j;
     unsigned long page_number, c;
     target_phys_addr_t addr, addr1;
     unsigned int len = ((section->size / TARGET_PAGE_SIZE) + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
+    unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
 
     /*
      * bitmap-traveling is faster than memory-traveling (for addr...)
@@ -363,10 +364,10 @@ static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
             do {
                 j = ffsl(c) - 1;
                 c &= ~(1ul << j);
-                page_number = i * HOST_LONG_BITS + j;
+                page_number = (i * HOST_LONG_BITS + j) * hpratio;
                 addr1 = page_number * TARGET_PAGE_SIZE;
                 addr = section->offset_within_region + addr1;
-                memory_region_set_dirty(section->mr, addr, TARGET_PAGE_SIZE);
+                memory_region_set_dirty(section->mr, addr, TARGET_PAGE_SIZE * hpratio);
             } while (c != 0);
         }
     }
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size David Gibson
@ 2012-02-24  1:03   ` Benjamin Herrenschmidt
  2012-02-26 21:41   ` Blue Swirl
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 36+ messages in thread
From: Benjamin Herrenschmidt @ 2012-02-24  1:03 UTC (permalink / raw)
  To: David Gibson; +Cc: Marcelo Tossatti, qemu-devel, anthony, Avi Kivity

> diff --git a/kvm-all.c b/kvm-all.c
> index 5e188bf..3f8cfd9 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -348,10 +348,11 @@ static int kvm_set_migration_log(int enable)
>  static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
>                                           unsigned long *bitmap)
>  {
> -    unsigned int i, j;
> +  unsigned int i, j;

That hunk looks like accidental damage, you should remove it :-)

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size David Gibson
  2012-02-24  1:03   ` Benjamin Herrenschmidt
@ 2012-02-26 21:41   ` Blue Swirl
  2012-02-27  0:16     ` David Gibson
  2012-02-27  0:36     ` Alexander Graf
  2012-02-28 12:32   ` Avi Kivity
  2012-02-28 23:32   ` Alexander Graf
  3 siblings, 2 replies; 36+ messages in thread
From: Blue Swirl @ 2012-02-26 21:41 UTC (permalink / raw)
  To: David Gibson; +Cc: Marcelo Tossatti, qemu-devel, anthony, Avi Kivity

On Fri, Feb 24, 2012 at 00:23, David Gibson <david@gibson.dropbear.id.au> wrote:
> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>
> If the kernel page size is larger than TARGET_PAGE_SIZE, which
> happens for example on ppc64 with kernels compiled for 64K pages,
> the dirty tracking doesn't work.

I think a better solution would be to push this to memory API and
underlying exec.c dirty tracking so that they use the same page size
as kernel (only in this KVM case, in general dirty tracking should
match TARGET_PAGE_SIZE granularity).

> Cc: Avi Kivity <avi@redhat.com>
> Cc: Marcelo Tossatti <mtossatti@redhat.com>
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  kvm-all.c |    7 ++++---
>  1 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/kvm-all.c b/kvm-all.c
> index 5e188bf..3f8cfd9 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -348,10 +348,11 @@ static int kvm_set_migration_log(int enable)
>  static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
>                                          unsigned long *bitmap)
>  {
> -    unsigned int i, j;
> +  unsigned int i, j;
>     unsigned long page_number, c;
>     target_phys_addr_t addr, addr1;
>     unsigned int len = ((section->size / TARGET_PAGE_SIZE) + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
> +    unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
>
>     /*
>      * bitmap-traveling is faster than memory-traveling (for addr...)
> @@ -363,10 +364,10 @@ static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
>             do {
>                 j = ffsl(c) - 1;
>                 c &= ~(1ul << j);
> -                page_number = i * HOST_LONG_BITS + j;
> +                page_number = (i * HOST_LONG_BITS + j) * hpratio;
>                 addr1 = page_number * TARGET_PAGE_SIZE;
>                 addr = section->offset_within_region + addr1;
> -                memory_region_set_dirty(section->mr, addr, TARGET_PAGE_SIZE);
> +                memory_region_set_dirty(section->mr, addr, TARGET_PAGE_SIZE * hpratio);
>             } while (c != 0);
>         }
>     }
> --
> 1.7.9
>
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-02-26 21:41   ` Blue Swirl
@ 2012-02-27  0:16     ` David Gibson
  2012-02-27  0:25       ` Benjamin Herrenschmidt
  2012-02-27  0:36     ` Alexander Graf
  1 sibling, 1 reply; 36+ messages in thread
From: David Gibson @ 2012-02-27  0:16 UTC (permalink / raw)
  To: Blue Swirl
  Cc: Benjamin Herrenschmidt, Marcelo Tossatti, qemu-devel, anthony,
	Avi Kivity

On Sun, Feb 26, 2012 at 09:41:17PM +0000, Blue Swirl wrote:
> On Fri, Feb 24, 2012 at 00:23, David Gibson <david@gibson.dropbear.id.au> wrote:
> > From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> >
> > If the kernel page size is larger than TARGET_PAGE_SIZE, which
> > happens for example on ppc64 with kernels compiled for 64K pages,
> > the dirty tracking doesn't work.
> 
> I think a better solution would be to push this to memory API and
> underlying exec.c dirty tracking so that they use the same page size
> as kernel (only in this KVM case, in general dirty tracking should
> match TARGET_PAGE_SIZE granularity).

I'm having trouble reconciling the two parts of this comment.  If it
should be in terms of TARGET_PAGE_SIZE generally, why _not_ keep it
that way always, and just do a fixup when we have to send the data to
the host kernel in terms of host kernel page size?

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-02-27  0:16     ` David Gibson
@ 2012-02-27  0:25       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 36+ messages in thread
From: Benjamin Herrenschmidt @ 2012-02-27  0:25 UTC (permalink / raw)
  To: David Gibson
  Cc: Blue Swirl, Marcelo Tossatti, qemu-devel, anthony, Avi Kivity

On Mon, 2012-02-27 at 11:16 +1100, David Gibson wrote:
> > > If the kernel page size is larger than TARGET_PAGE_SIZE, which
> > > happens for example on ppc64 with kernels compiled for 64K pages,
> > > the dirty tracking doesn't work.
> > 
> > I think a better solution would be to push this to memory API and
> > underlying exec.c dirty tracking so that they use the same page size
> > as kernel (only in this KVM case, in general dirty tracking should
> > match TARGET_PAGE_SIZE granularity). 

That sounds horrible... you propose a -MUCH- more invasive change to a
nasty & complex core piece of code to deal with what is fixed by a
2-liner patch ?

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-02-26 21:41   ` Blue Swirl
  2012-02-27  0:16     ` David Gibson
@ 2012-02-27  0:36     ` Alexander Graf
  2012-03-03 15:29       ` Blue Swirl
  1 sibling, 1 reply; 36+ messages in thread
From: Alexander Graf @ 2012-02-27  0:36 UTC (permalink / raw)
  To: Blue Swirl
  Cc: Avi Kivity, Marcelo Tossatti, qemu-devel, anthony, David Gibson


On 26.02.2012, at 22:41, Blue Swirl wrote:

> On Fri, Feb 24, 2012 at 00:23, David Gibson <david@gibson.dropbear.id.au> wrote:
>> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> 
>> If the kernel page size is larger than TARGET_PAGE_SIZE, which
>> happens for example on ppc64 with kernels compiled for 64K pages,
>> the dirty tracking doesn't work.
> 
> I think a better solution would be to push this to memory API and
> underlying exec.c dirty tracking so that they use the same page size
> as kernel (only in this KVM case, in general dirty tracking should
> match TARGET_PAGE_SIZE granularity).

Yeah, that would allow us to make sure we only align MMIO regions where we can, but I don't think it's an easy change to make. And this way the common page size throughout QEMU is TARGET_PAGE_SIZE, which other pieces of code rely on. Also, dynamically changing TARGET_PAGE_SIZE has unknown performance implications.

So for the time being, I definitely think this is the right approach. It's easy and isolated :).


Alex

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-02-27  0:36     ` Alexander Graf
@ 2012-03-03 15:29       ` Blue Swirl
  0 siblings, 0 replies; 36+ messages in thread
From: Blue Swirl @ 2012-03-03 15:29 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Avi Kivity, Marcelo Tossatti, qemu-devel, anthony, David Gibson

On Mon, Feb 27, 2012 at 00:36, Alexander Graf <agraf@suse.de> wrote:
>
> On 26.02.2012, at 22:41, Blue Swirl wrote:
>
>> On Fri, Feb 24, 2012 at 00:23, David Gibson <david@gibson.dropbear.id.au> wrote:
>>> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>>
>>> If the kernel page size is larger than TARGET_PAGE_SIZE, which
>>> happens for example on ppc64 with kernels compiled for 64K pages,
>>> the dirty tracking doesn't work.
>>
>> I think a better solution would be to push this to memory API and
>> underlying exec.c dirty tracking so that they use the same page size
>> as kernel (only in this KVM case, in general dirty tracking should
>> match TARGET_PAGE_SIZE granularity).
>
> Yeah, that would allow us to make sure we only align MMIO regions where we can, but I don't think it's an easy change to make. And this way the common page size throughout QEMU is TARGET_PAGE_SIZE, which other pieces of code rely on. Also, dynamically changing TARGET_PAGE_SIZE has unknown performance implications.
>
> So for the time being, I definitely think this is the right approach. It's easy and isolated :).

Laziness ;-) This could be the initial approach though and someone
could improve upon it later.

For better performance, KVM and QEMU could even share the bit maps (if
the page sizes are kept in synch) so there would be no need to
synchronize. But if MMU page sizes are not constant (huge pages, huge
gaps), also bit map approach could be suboptimal and some other way
could be needed.

>
>
> Alex
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size David Gibson
  2012-02-24  1:03   ` Benjamin Herrenschmidt
  2012-02-26 21:41   ` Blue Swirl
@ 2012-02-28 12:32   ` Avi Kivity
  2012-02-28 21:48     ` Benjamin Herrenschmidt
  2012-02-28 23:32   ` Alexander Graf
  3 siblings, 1 reply; 36+ messages in thread
From: Avi Kivity @ 2012-02-28 12:32 UTC (permalink / raw)
  To: David Gibson; +Cc: Marcelo Tosatti, qemu-devel, anthony

On 02/24/2012 02:23 AM, David Gibson wrote:
> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>
> If the kernel page size is larger than TARGET_PAGE_SIZE, which
> happens for example on ppc64 with kernels compiled for 64K pages,
> the dirty tracking doesn't work.
>
> Cc: Avi Kivity <avi@redhat.com>
> Cc: Marcelo Tossatti <mtossatti@redhat.com>
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  kvm-all.c |    7 ++++---
>  1 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/kvm-all.c b/kvm-all.c
> index 5e188bf..3f8cfd9 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -348,10 +348,11 @@ static int kvm_set_migration_log(int enable)
>  static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
>                                           unsigned long *bitmap)
>  {
> -    unsigned int i, j;
> +  unsigned int i, j;
>      unsigned long page_number, c;
>      target_phys_addr_t addr, addr1;
>      unsigned int len = ((section->size / TARGET_PAGE_SIZE) + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
> +    unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
>  
>

What if TARGET_PAGE_SIZE > getpagesize()?  Or is that impossible?

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-02-28 12:32   ` Avi Kivity
@ 2012-02-28 21:48     ` Benjamin Herrenschmidt
  2012-03-04 10:49       ` Avi Kivity
  0 siblings, 1 reply; 36+ messages in thread
From: Benjamin Herrenschmidt @ 2012-02-28 21:48 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, qemu-devel, anthony, David Gibson

On Tue, 2012-02-28 at 14:32 +0200, Avi Kivity wrote:

> What if TARGET_PAGE_SIZE > getpagesize()?  Or is that impossible?

We have yet to encounter such a case. It's not currently possible on
power (some old embedded chips could do 1K and 2K page sizes in the TLB
iirc but we never supported that in Linux and it's being phased out in
HW).

I suggest that gets dealt with when/if it needs to, which means probably
never :-)

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-02-28 21:48     ` Benjamin Herrenschmidt
@ 2012-03-04 10:49       ` Avi Kivity
  2012-03-04 11:53         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 36+ messages in thread
From: Avi Kivity @ 2012-03-04 10:49 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Marcelo Tosatti, qemu-devel, anthony, David Gibson

On 02/28/2012 11:48 PM, Benjamin Herrenschmidt wrote:
> On Tue, 2012-02-28 at 14:32 +0200, Avi Kivity wrote:
>
> > What if TARGET_PAGE_SIZE > getpagesize()?  Or is that impossible?
>
> We have yet to encounter such a case. It's not currently possible on
> power (some old embedded chips could do 1K and 2K page sizes in the TLB
> iirc but we never supported that in Linux and it's being phased out in
> HW).
>
> I suggest that gets dealt with when/if it needs to, which means probably
> never :-)

Doesn't ppc support both 4k and 64k pages?  Suppose you run a 4k guest
on a 64k host?

Maybe I'm misremembering or misunderstanding something.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-03-04 10:49       ` Avi Kivity
@ 2012-03-04 11:53         ` Benjamin Herrenschmidt
  2012-03-04 12:18           ` Avi Kivity
  2012-03-04 16:46           ` Andreas Färber
  0 siblings, 2 replies; 36+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-04 11:53 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, qemu-devel, anthony, David Gibson

On Sun, 2012-03-04 at 12:49 +0200, Avi Kivity wrote:
> On 02/28/2012 11:48 PM, Benjamin Herrenschmidt wrote:
> > On Tue, 2012-02-28 at 14:32 +0200, Avi Kivity wrote:
> >
> > > What if TARGET_PAGE_SIZE > getpagesize()?  Or is that impossible?
> >
> > We have yet to encounter such a case. It's not currently possible on
> > power (some old embedded chips could do 1K and 2K page sizes in the TLB
> > iirc but we never supported that in Linux and it's being phased out in
> > HW).
> >
> > I suggest that gets dealt with when/if it needs to, which means probably
> > never :-)
> 
> Doesn't ppc support both 4k and 64k pages?  Suppose you run a 4k guest
> on a 64k host?
> 
> Maybe I'm misremembering or misunderstanding something.

TARGET_PAGE_SIZE in qemu is always 4k for powerpc, it's a compile time
#define.

The host kernel exposes the dirty bit map with a bit per -host- kernel
PAGE_SIZE (which is what getpagesize() gets you in qemu).

My patch makes thus makes things work when the host uses 64K page sizes.
In all cases, the guest page size is irrelevant, this is purely a
problem between the host kernel and qemu.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-03-04 11:53         ` Benjamin Herrenschmidt
@ 2012-03-04 12:18           ` Avi Kivity
  2012-03-04 16:46           ` Andreas Färber
  1 sibling, 0 replies; 36+ messages in thread
From: Avi Kivity @ 2012-03-04 12:18 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Marcelo Tosatti, qemu-devel, anthony, David Gibson

On 03/04/2012 01:53 PM, Benjamin Herrenschmidt wrote:
> On Sun, 2012-03-04 at 12:49 +0200, Avi Kivity wrote:
> > On 02/28/2012 11:48 PM, Benjamin Herrenschmidt wrote:
> > > On Tue, 2012-02-28 at 14:32 +0200, Avi Kivity wrote:
> > >
> > > > What if TARGET_PAGE_SIZE > getpagesize()?  Or is that impossible?
> > >
> > > We have yet to encounter such a case. It's not currently possible on
> > > power (some old embedded chips could do 1K and 2K page sizes in the TLB
> > > iirc but we never supported that in Linux and it's being phased out in
> > > HW).
> > >
> > > I suggest that gets dealt with when/if it needs to, which means probably
> > > never :-)
> > 
> > Doesn't ppc support both 4k and 64k pages?  Suppose you run a 4k guest
> > on a 64k host?
> > 
> > Maybe I'm misremembering or misunderstanding something.
>
> TARGET_PAGE_SIZE in qemu is always 4k for powerpc, it's a compile time
> #define.
>
> The host kernel exposes the dirty bit map with a bit per -host- kernel
> PAGE_SIZE (which is what getpagesize() gets you in qemu).
>
> My patch makes thus makes things work when the host uses 64K page sizes.
> In all cases, the guest page size is irrelevant, this is purely a
> problem between the host kernel and qemu.

Right (and I actually knew all this stuff before :( ).

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-03-04 11:53         ` Benjamin Herrenschmidt
  2012-03-04 12:18           ` Avi Kivity
@ 2012-03-04 16:46           ` Andreas Färber
  2012-03-04 18:46             ` Alexander Graf
  2012-03-04 21:17             ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 36+ messages in thread
From: Andreas Färber @ 2012-03-04 16:46 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Marcelo Tosatti, Alexander Graf, qemu-devel, Avi Kivity, anthony,
	David Gibson

Am 04.03.2012 12:53, schrieb Benjamin Herrenschmidt:
> On Sun, 2012-03-04 at 12:49 +0200, Avi Kivity wrote:
>> On 02/28/2012 11:48 PM, Benjamin Herrenschmidt wrote:
>>> On Tue, 2012-02-28 at 14:32 +0200, Avi Kivity wrote:
>>>
>>>> What if TARGET_PAGE_SIZE > getpagesize()?  Or is that impossible?
>>>
>>> We have yet to encounter such a case. It's not currently possible on
>>> power (some old embedded chips could do 1K and 2K page sizes in the TLB
>>> iirc but we never supported that in Linux and it's being phased out in
>>> HW).
>>>
>>> I suggest that gets dealt with when/if it needs to, which means probably
>>> never :-)
>>
>> Doesn't ppc support both 4k and 64k pages?  Suppose you run a 4k guest
>> on a 64k host?
>>
>> Maybe I'm misremembering or misunderstanding something.

> TARGET_PAGE_SIZE in qemu is always 4k for powerpc, it's a compile time
> #define.

Except for ppcemb-softmmu (1k), which is irrelevant for KVM AFAIU.

Maybe just add an assert and be done with it?

For the record, I am hoping to get rid of some of those cpu.h defines in
a later stage of QOM'ification.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-03-04 16:46           ` Andreas Färber
@ 2012-03-04 18:46             ` Alexander Graf
  2012-03-04 20:21               ` Andreas Färber
  2012-03-04 21:17             ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 36+ messages in thread
From: Alexander Graf @ 2012-03-04 18:46 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Marcelo Tosatti, qemu-devel@nongnu.org, Avi Kivity,
	anthony@codemonkey.ws, David Gibson



On 04.03.2012, at 17:46, Andreas Färber <afaerber@suse.de> wrote:

> Am 04.03.2012 12:53, schrieb Benjamin Herrenschmidt:
>> On Sun, 2012-03-04 at 12:49 +0200, Avi Kivity wrote:
>>> On 02/28/2012 11:48 PM, Benjamin Herrenschmidt wrote:
>>>> On Tue, 2012-02-28 at 14:32 +0200, Avi Kivity wrote:
>>>> 
>>>>> What if TARGET_PAGE_SIZE > getpagesize()?  Or is that impossible?
>>>> 
>>>> We have yet to encounter such a case. It's not currently possible on
>>>> power (some old embedded chips could do 1K and 2K page sizes in the TLB
>>>> iirc but we never supported that in Linux and it's being phased out in
>>>> HW).
>>>> 
>>>> I suggest that gets dealt with when/if it needs to, which means probably
>>>> never :-)
>>> 
>>> Doesn't ppc support both 4k and 64k pages?  Suppose you run a 4k guest
>>> on a 64k host?
>>> 
>>> Maybe I'm misremembering or misunderstanding something.
> 
>> TARGET_PAGE_SIZE in qemu is always 4k for powerpc, it's a compile time
>> #define.
> 
> Except for ppcemb-softmmu (1k), which is irrelevant for KVM AFAIU.
> 
> Maybe just add an assert and be done with it?

Assert for what? Linux page size of 64k is something perfectly normal on ppc. The hardware can always do at least 4k maps however.

Alex

> 
> For the record, I am hoping to get rid of some of those cpu.h defines in
> a later stage of QOM'ification.
> 
> Andreas
> 
> -- 
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-03-04 18:46             ` Alexander Graf
@ 2012-03-04 20:21               ` Andreas Färber
  2012-03-04 20:25                 ` Alexander Graf
  0 siblings, 1 reply; 36+ messages in thread
From: Andreas Färber @ 2012-03-04 20:21 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Marcelo Tosatti, qemu-devel@nongnu.org, Avi Kivity,
	anthony@codemonkey.ws, David Gibson

Am 04.03.2012 19:46, schrieb Alexander Graf:
> 
> 
> On 04.03.2012, at 17:46, Andreas Färber <afaerber@suse.de> wrote:
> 
>> Am 04.03.2012 12:53, schrieb Benjamin Herrenschmidt:
>>> On Sun, 2012-03-04 at 12:49 +0200, Avi Kivity wrote:
>>>> On 02/28/2012 11:48 PM, Benjamin Herrenschmidt wrote:
>>>>> On Tue, 2012-02-28 at 14:32 +0200, Avi Kivity wrote:
>>>>>
>>>>>> What if TARGET_PAGE_SIZE > getpagesize()?  Or is that impossible?
>>>>>
>>>>> We have yet to encounter such a case. It's not currently possible on
>>>>> power (some old embedded chips could do 1K and 2K page sizes in the TLB
>>>>> iirc but we never supported that in Linux and it's being phased out in
>>>>> HW).
>>>>>
>>>>> I suggest that gets dealt with when/if it needs to, which means probably
>>>>> never :-)
>>>>
>>>> Doesn't ppc support both 4k and 64k pages?  Suppose you run a 4k guest
>>>> on a 64k host?
>>>>
>>>> Maybe I'm misremembering or misunderstanding something.
>>
>>> TARGET_PAGE_SIZE in qemu is always 4k for powerpc, it's a compile time
>>> #define.
>>
>> Except for ppcemb-softmmu (1k), which is irrelevant for KVM AFAIU.
>>
>> Maybe just add an assert and be done with it?
> 
> Assert for what? Linux page size of 64k is something perfectly normal on ppc. The hardware can always do at least 4k maps however.

g_assert(TARGET_PAGE_SIZE <= getpagesize())

Just declare the above case as unsupported and abort if we encounter it.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-03-04 20:21               ` Andreas Färber
@ 2012-03-04 20:25                 ` Alexander Graf
  2012-03-04 20:31                   ` Andreas Färber
  0 siblings, 1 reply; 36+ messages in thread
From: Alexander Graf @ 2012-03-04 20:25 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Marcelo Tosatti, qemu-devel@nongnu.org, Avi Kivity,
	anthony@codemonkey.ws, David Gibson



On 04.03.2012, at 21:21, Andreas Färber <afaerber@suse.de> wrote:

> Am 04.03.2012 19:46, schrieb Alexander Graf:
>> 
>> 
>> On 04.03.2012, at 17:46, Andreas Färber <afaerber@suse.de> wrote:
>> 
>>> Am 04.03.2012 12:53, schrieb Benjamin Herrenschmidt:
>>>> On Sun, 2012-03-04 at 12:49 +0200, Avi Kivity wrote:
>>>>> On 02/28/2012 11:48 PM, Benjamin Herrenschmidt wrote:
>>>>>> On Tue, 2012-02-28 at 14:32 +0200, Avi Kivity wrote:
>>>>>> 
>>>>>>> What if TARGET_PAGE_SIZE > getpagesize()?  Or is that impossible?
>>>>>> 
>>>>>> We have yet to encounter such a case. It's not currently possible on
>>>>>> power (some old embedded chips could do 1K and 2K page sizes in the TLB
>>>>>> iirc but we never supported that in Linux and it's being phased out in
>>>>>> HW).
>>>>>> 
>>>>>> I suggest that gets dealt with when/if it needs to, which means probably
>>>>>> never :-)
>>>>> 
>>>>> Doesn't ppc support both 4k and 64k pages?  Suppose you run a 4k guest
>>>>> on a 64k host?
>>>>> 
>>>>> Maybe I'm misremembering or misunderstanding something.
>>> 
>>>> TARGET_PAGE_SIZE in qemu is always 4k for powerpc, it's a compile time
>>>> #define.
>>> 
>>> Except for ppcemb-softmmu (1k), which is irrelevant for KVM AFAIU.
>>> 
>>> Maybe just add an assert and be done with it?
>> 
>> Assert for what? Linux page size of 64k is something perfectly normal on ppc. The hardware can always do at least 4k maps however.
> 
> g_assert(TARGET_PAGE_SIZE <= getpagesize())
> 
> Just declare the above case as unsupported and abort if we encounter it.

What I'm trying to tell you is that it's the default case on book3s ppc! ;)

Alex

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-03-04 20:25                 ` Alexander Graf
@ 2012-03-04 20:31                   ` Andreas Färber
  2012-03-04 20:59                     ` Alexander Graf
  0 siblings, 1 reply; 36+ messages in thread
From: Andreas Färber @ 2012-03-04 20:31 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Marcelo Tosatti, qemu-devel@nongnu.org, Avi Kivity,
	anthony@codemonkey.ws, David Gibson

Am 04.03.2012 21:25, schrieb Alexander Graf:
> 
> 
> On 04.03.2012, at 21:21, Andreas Färber <afaerber@suse.de> wrote:
> 
>> Am 04.03.2012 19:46, schrieb Alexander Graf:
>>>
>>>
>>> On 04.03.2012, at 17:46, Andreas Färber <afaerber@suse.de> wrote:
>>>
>>>> Am 04.03.2012 12:53, schrieb Benjamin Herrenschmidt:
>>>>> On Sun, 2012-03-04 at 12:49 +0200, Avi Kivity wrote:
>>>>>> On 02/28/2012 11:48 PM, Benjamin Herrenschmidt wrote:
>>>>>>> On Tue, 2012-02-28 at 14:32 +0200, Avi Kivity wrote:
>>>>>>>
>>>>>>>> What if TARGET_PAGE_SIZE > getpagesize()?  Or is that impossible?
>>>>>>>
>>>>>>> We have yet to encounter such a case. It's not currently possible on
>>>>>>> power (some old embedded chips could do 1K and 2K page sizes in the TLB
>>>>>>> iirc but we never supported that in Linux and it's being phased out in
>>>>>>> HW).
>>>>>>>
>>>>>>> I suggest that gets dealt with when/if it needs to, which means probably
>>>>>>> never :-)
>>>>>>
>>>>>> Doesn't ppc support both 4k and 64k pages?  Suppose you run a 4k guest
>>>>>> on a 64k host?
>>>>>>
>>>>>> Maybe I'm misremembering or misunderstanding something.
>>>>
>>>>> TARGET_PAGE_SIZE in qemu is always 4k for powerpc, it's a compile time
>>>>> #define.
>>>>
>>>> Except for ppcemb-softmmu (1k), which is irrelevant for KVM AFAIU.
>>>>
>>>> Maybe just add an assert and be done with it?
>>>
>>> Assert for what? Linux page size of 64k is something perfectly normal on ppc. The hardware can always do at least 4k maps however.
>>
>> g_assert(TARGET_PAGE_SIZE <= getpagesize())
>>
>> Just declare the above case as unsupported and abort if we encounter it.
> 
> What I'm trying to tell you is that it's the default case on book3s ppc! ;)

Exactly, which is why I'm saying just ignore the weird embedded case. :)

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-03-04 20:31                   ` Andreas Färber
@ 2012-03-04 20:59                     ` Alexander Graf
  2012-03-04 21:19                       ` Benjamin Herrenschmidt
  2012-03-04 21:21                       ` Andreas Färber
  0 siblings, 2 replies; 36+ messages in thread
From: Alexander Graf @ 2012-03-04 20:59 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Marcelo Tosatti, qemu-devel@nongnu.org, Avi Kivity,
	anthony@codemonkey.ws, David Gibson



On 04.03.2012, at 21:31, Andreas Färber <afaerber@suse.de> wrote:

> Am 04.03.2012 21:25, schrieb Alexander Graf:
>> 
>> 
>> On 04.03.2012, at 21:21, Andreas Färber <afaerber@suse.de> wrote:
>> 
>>> Am 04.03.2012 19:46, schrieb Alexander Graf:
>>>> 
>>>> 
>>>> On 04.03.2012, at 17:46, Andreas Färber <afaerber@suse.de> wrote:
>>>> 
>>>>> Am 04.03.2012 12:53, schrieb Benjamin Herrenschmidt:
>>>>>> On Sun, 2012-03-04 at 12:49 +0200, Avi Kivity wrote:
>>>>>>> On 02/28/2012 11:48 PM, Benjamin Herrenschmidt wrote:
>>>>>>>> On Tue, 2012-02-28 at 14:32 +0200, Avi Kivity wrote:
>>>>>>>> 
>>>>>>>>> What if TARGET_PAGE_SIZE > getpagesize()?  Or is that impossible?
>>>>>>>> 
>>>>>>>> We have yet to encounter such a case. It's not currently possible on
>>>>>>>> power (some old embedded chips could do 1K and 2K page sizes in the TLB
>>>>>>>> iirc but we never supported that in Linux and it's being phased out in
>>>>>>>> HW).
>>>>>>>> 
>>>>>>>> I suggest that gets dealt with when/if it needs to, which means probably
>>>>>>>> never :-)
>>>>>>> 
>>>>>>> Doesn't ppc support both 4k and 64k pages?  Suppose you run a 4k guest
>>>>>>> on a 64k host?
>>>>>>> 
>>>>>>> Maybe I'm misremembering or misunderstanding something.
>>>>> 
>>>>>> TARGET_PAGE_SIZE in qemu is always 4k for powerpc, it's a compile time
>>>>>> #define.
>>>>> 
>>>>> Except for ppcemb-softmmu (1k), which is irrelevant for KVM AFAIU.
>>>>> 
>>>>> Maybe just add an assert and be done with it?
>>>> 
>>>> Assert for what? Linux page size of 64k is something perfectly normal on ppc. The hardware can always do at least 4k maps however.
>>> 
>>> g_assert(TARGET_PAGE_SIZE <= getpagesize())
>>> 
>>> Just declare the above case as unsupported and abort if we encounter it.
>> 
>> What I'm trying to tell you is that it's the default case on book3s ppc! ;)
> 
> Exactly, which is why I'm saying just ignore the weird embedded case. :)

Ugh. Sorry, apparently I can't read :). So you're saying 'break for ppcemb'. Hrm. Not sure that'd be all that great for 440, since there host pagesize is still 4k, but T_P_S is 1k.

Alex


> 
> Andreas
> 
> -- 
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-03-04 20:59                     ` Alexander Graf
@ 2012-03-04 21:19                       ` Benjamin Herrenschmidt
  2012-03-04 21:45                         ` Alexander Graf
  2012-03-04 21:21                       ` Andreas Färber
  1 sibling, 1 reply; 36+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-04 21:19 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Marcelo Tosatti, qemu-devel@nongnu.org, Avi Kivity,
	anthony@codemonkey.ws, Andreas Färber, David Gibson

On Sun, 2012-03-04 at 21:59 +0100, Alexander Graf wrote:
> >>> g_assert(TARGET_PAGE_SIZE <= getpagesize())
> >>> 
> >>> Just declare the above case as unsupported and abort if we
> encounter it.
> >> 
> >> What I'm trying to tell you is that it's the default case on book3s
> ppc! ;)
> > 
> > Exactly, which is why I'm saying just ignore the weird embedded
> case. :)
> 
> Ugh. Sorry, apparently I can't read :). So you're saying 'break for
> ppcemb'. Hrm. Not sure that'd be all that great for 440, since there
> host pagesize is still 4k, but T_P_S is 1k.

No, Alex, you are reading Andreas assert backward :-)

What he suggests is "break if TARGET_PAGE_SIZE > getpagesize()" which
currently cannot happen. Even embedded 1k TARGET_PAGE_SIZE is fine. It's
getpagesize() 1k that wouldn't be but it also never happens.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-03-04 21:19                       ` Benjamin Herrenschmidt
@ 2012-03-04 21:45                         ` Alexander Graf
  0 siblings, 0 replies; 36+ messages in thread
From: Alexander Graf @ 2012-03-04 21:45 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Marcelo Tosatti, qemu-devel@nongnu.org, Avi Kivity,
	anthony@codemonkey.ws, Andreas Färber, David Gibson



On 04.03.2012, at 22:19, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Sun, 2012-03-04 at 21:59 +0100, Alexander Graf wrote:
>>>>> g_assert(TARGET_PAGE_SIZE <= getpagesize())
>>>>> 
>>>>> Just declare the above case as unsupported and abort if we
>> encounter it.
>>>> 
>>>> What I'm trying to tell you is that it's the default case on book3s
>> ppc! ;)
>>> 
>>> Exactly, which is why I'm saying just ignore the weird embedded
>> case. :)
>> 
>> Ugh. Sorry, apparently I can't read :). So you're saying 'break for
>> ppcemb'. Hrm. Not sure that'd be all that great for 440, since there
>> host pagesize is still 4k, but T_P_S is 1k.
> 
> No, Alex, you are reading Andreas assert backward :-)
> 
> What he suggests is "break if TARGET_PAGE_SIZE > getpagesize()" which
> currently cannot happen. Even embedded 1k TARGET_PAGE_SIZE is fine. It's
> getpagesize() 1k that wouldn't be but it also never happens.

O_o. Ok, that would obviously work. We never get hostpagesize <4k and T_P_S is always 1k/4k, never bigger.

Alex

> 
> Cheers,
> Ben.
> 
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-03-04 20:59                     ` Alexander Graf
  2012-03-04 21:19                       ` Benjamin Herrenschmidt
@ 2012-03-04 21:21                       ` Andreas Färber
  1 sibling, 0 replies; 36+ messages in thread
From: Andreas Färber @ 2012-03-04 21:21 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Marcelo Tosatti, qemu-devel@nongnu.org, Avi Kivity,
	anthony@codemonkey.ws, David Gibson

Am 04.03.2012 21:59, schrieb Alexander Graf:
> 
> 
> On 04.03.2012, at 21:31, Andreas Färber <afaerber@suse.de> wrote:
> 
>> Am 04.03.2012 21:25, schrieb Alexander Graf:
>>>
>>>
>>> On 04.03.2012, at 21:21, Andreas Färber <afaerber@suse.de> wrote:
>>>
>>>> Am 04.03.2012 19:46, schrieb Alexander Graf:
>>>>>
>>>>>
>>>>> On 04.03.2012, at 17:46, Andreas Färber <afaerber@suse.de> wrote:
>>>>>
>>>>>> Am 04.03.2012 12:53, schrieb Benjamin Herrenschmidt:
>>>>>>> On Sun, 2012-03-04 at 12:49 +0200, Avi Kivity wrote:
>>>>>>>> On 02/28/2012 11:48 PM, Benjamin Herrenschmidt wrote:
>>>>>>>>> On Tue, 2012-02-28 at 14:32 +0200, Avi Kivity wrote:
>>>>>>>>>
>>>>>>>>>> What if TARGET_PAGE_SIZE > getpagesize()?  Or is that impossible?
>>>>>>>>>
>>>>>>>>> We have yet to encounter such a case. It's not currently possible on
>>>>>>>>> power (some old embedded chips could do 1K and 2K page sizes in the TLB
>>>>>>>>> iirc but we never supported that in Linux and it's being phased out in
>>>>>>>>> HW).
>>>>>>>>>
>>>>>>>>> I suggest that gets dealt with when/if it needs to, which means probably
>>>>>>>>> never :-)
>>>>>>>>
>>>>>>>> Doesn't ppc support both 4k and 64k pages?  Suppose you run a 4k guest
>>>>>>>> on a 64k host?
>>>>>>>>
>>>>>>>> Maybe I'm misremembering or misunderstanding something.
>>>>>>
>>>>>>> TARGET_PAGE_SIZE in qemu is always 4k for powerpc, it's a compile time
>>>>>>> #define.
>>>>>>
>>>>>> Except for ppcemb-softmmu (1k), which is irrelevant for KVM AFAIU.
>>>>>>
>>>>>> Maybe just add an assert and be done with it?
>>>>>
>>>>> Assert for what? Linux page size of 64k is something perfectly normal on ppc. The hardware can always do at least 4k maps however.
>>>>
>>>> g_assert(TARGET_PAGE_SIZE <= getpagesize())
>>>>
>>>> Just declare the above case as unsupported and abort if we encounter it.
>>>
>>> What I'm trying to tell you is that it's the default case on book3s ppc! ;)
>>
>> Exactly, which is why I'm saying just ignore the weird embedded case. :)
> 
> [...] So you're saying 'break for ppcemb'. Hrm. Not sure that'd be all that great for 440, since there host pagesize is still 4k, but T_P_S is 1k.

Err, 1k <= 4k would still be supported. The way I see it, the only cases
breaking would be ppc with host page size < 4k, and ppcemb with host
page size < 1k (which I'm not aware of).

Is it realistic to expect virtualizing a Mac or pSeries to work on a
1k/2k bamboo? TCG would be unaffected AFAICT.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-03-04 16:46           ` Andreas Färber
  2012-03-04 18:46             ` Alexander Graf
@ 2012-03-04 21:17             ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 36+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-04 21:17 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Marcelo Tosatti, Alexander Graf, qemu-devel, Avi Kivity, anthony,
	David Gibson

On Sun, 2012-03-04 at 17:46 +0100, Andreas Färber wrote:
> Except for ppcemb-softmmu (1k), which is irrelevant for KVM AFAIU.
> 
> Maybe just add an assert and be done with it?

Well, my patch should work anyway, as long as getpagesize() returns a
higher power of two.

The case that wouldn't work (but also doesn't exist) is getpagesize() <
TARGET_PAGE_SIZE.

> For the record, I am hoping to get rid of some of those cpu.h defines
> in a later stage of QOM'ification.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-02-24  0:23 ` [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size David Gibson
                     ` (2 preceding siblings ...)
  2012-02-28 12:32   ` Avi Kivity
@ 2012-02-28 23:32   ` Alexander Graf
  2012-02-29  0:22     ` David Gibson
  3 siblings, 1 reply; 36+ messages in thread
From: Alexander Graf @ 2012-02-28 23:32 UTC (permalink / raw)
  To: David Gibson; +Cc: Marcelo Tossatti, qemu-devel, anthony, Avi Kivity


On 24.02.2012, at 01:23, David Gibson wrote:

> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> 
> If the kernel page size is larger than TARGET_PAGE_SIZE, which
> happens for example on ppc64 with kernels compiled for 64K pages,
> the dirty tracking doesn't work.
> 
> Cc: Avi Kivity <avi@redhat.com>
> Cc: Marcelo Tossatti <mtossatti@redhat.com>
> 
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> kvm-all.c |    7 ++++---
> 1 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/kvm-all.c b/kvm-all.c
> index 5e188bf..3f8cfd9 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -348,10 +348,11 @@ static int kvm_set_migration_log(int enable)
> static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
>                                          unsigned long *bitmap)
> {
> -    unsigned int i, j;
> +  unsigned int i, j;
>     unsigned long page_number, c;
>     target_phys_addr_t addr, addr1;
>     unsigned int len = ((section->size / TARGET_PAGE_SIZE) + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
> +    unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;

Actually, looking at this, don't we rather want to cache hpratio? The way this is implemented, it would mean we'd do a sysctl for every call, right?


Alex

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size
  2012-02-28 23:32   ` Alexander Graf
@ 2012-02-29  0:22     ` David Gibson
  0 siblings, 0 replies; 36+ messages in thread
From: David Gibson @ 2012-02-29  0:22 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Marcelo Tossatti, qemu-devel, anthony, Avi Kivity

On Wed, Feb 29, 2012 at 12:32:51AM +0100, Alexander Graf wrote:
> 
> On 24.02.2012, at 01:23, David Gibson wrote:
> 
> > From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > 
> > If the kernel page size is larger than TARGET_PAGE_SIZE, which
> > happens for example on ppc64 with kernels compiled for 64K pages,
> > the dirty tracking doesn't work.
> > 
> > Cc: Avi Kivity <avi@redhat.com>
> > Cc: Marcelo Tossatti <mtossatti@redhat.com>
> > 
> > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> > kvm-all.c |    7 ++++---
> > 1 files changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kvm-all.c b/kvm-all.c
> > index 5e188bf..3f8cfd9 100644
> > --- a/kvm-all.c
> > +++ b/kvm-all.c
> > @@ -348,10 +348,11 @@ static int kvm_set_migration_log(int enable)
> > static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
> >                                          unsigned long *bitmap)
> > {
> > -    unsigned int i, j;
> > +  unsigned int i, j;
> >     unsigned long page_number, c;
> >     target_phys_addr_t addr, addr1;
> >     unsigned int len = ((section->size / TARGET_PAGE_SIZE) + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
> > +    unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
> 
> Actually, looking at this, don't we rather want to cache hpratio?
> The way this is implemented, it would mean we'd do a sysctl for
> every call, right?

I think libc already caches this.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2012-03-04 21:45 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-02-24  0:23 [Qemu-devel] [0/6] Assorted bugfixes David Gibson
2012-02-24  0:23 ` [Qemu-devel] [PATCH 1/6] kvm: Comparison with ioctl number macros needs to be unsigned David Gibson
2012-02-24  0:23 ` [Qemu-devel] [PATCH 2/6] slirp: Fix assertion failure on rejected DHCP requests David Gibson
2012-02-27 13:58   ` Jan Kiszka
2012-02-28  1:07     ` David Gibson
2012-02-24  0:23 ` [Qemu-devel] [PATCH 3/6] USB OHCI bug fixes David Gibson
2012-02-27 14:13   ` Gerd Hoffmann
2012-02-28  3:09     ` David Gibson
2012-02-28  3:37       ` David Gibson
2012-02-24  0:23 ` [Qemu-devel] [PATCH 4/6] Endian fixes for virtfs David Gibson
2012-02-24  8:29   ` Aneesh Kumar K.V
2012-02-24  0:23 ` [Qemu-devel] [PATCH 5/6] Endian fix an assertion in usb-msd David Gibson
2012-02-24  0:23 ` [Qemu-devel] [PATCH 6/6] kvm: Fix dirty tracking with large kernel page size David Gibson
2012-02-24  1:03   ` Benjamin Herrenschmidt
2012-02-26 21:41   ` Blue Swirl
2012-02-27  0:16     ` David Gibson
2012-02-27  0:25       ` Benjamin Herrenschmidt
2012-02-27  0:36     ` Alexander Graf
2012-03-03 15:29       ` Blue Swirl
2012-02-28 12:32   ` Avi Kivity
2012-02-28 21:48     ` Benjamin Herrenschmidt
2012-03-04 10:49       ` Avi Kivity
2012-03-04 11:53         ` Benjamin Herrenschmidt
2012-03-04 12:18           ` Avi Kivity
2012-03-04 16:46           ` Andreas Färber
2012-03-04 18:46             ` Alexander Graf
2012-03-04 20:21               ` Andreas Färber
2012-03-04 20:25                 ` Alexander Graf
2012-03-04 20:31                   ` Andreas Färber
2012-03-04 20:59                     ` Alexander Graf
2012-03-04 21:19                       ` Benjamin Herrenschmidt
2012-03-04 21:45                         ` Alexander Graf
2012-03-04 21:21                       ` Andreas Färber
2012-03-04 21:17             ` Benjamin Herrenschmidt
2012-02-28 23:32   ` Alexander Graf
2012-02-29  0:22     ` David Gibson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).