* [Qemu-devel] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration
@ 2014-11-17  4:12 Samuel Mendoza-Jonas
  2014-11-17  4:12 ` [Qemu-devel] [PATCH V2 1/3] spapr: Fix stale HTAB during live migration (KVM) Samuel Mendoza-Jonas
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Samuel Mendoza-Jonas @ 2014-11-17  4:12 UTC
  To: qemu-ppc, qemu-devel; +Cc: aik, Samuel Mendoza-Jonas

If a spapr guest reboots during a live migration, the guest HTAB on the
destination is not updated properly, usually resulting in a kernel panic.

This is a (delayed!) follow-up to my previous patch; it now includes a
fix for TCG guests as well as KVM.

Changes from V1:
- Split out overflow fix into separate patch
- Removed unnecessary locks (relevant operations occur under BQL)
- TCG: Set HTAB dirty instead of resetting migration state
- Minor style fixes

Samuel Mendoza-Jonas (3):
  spapr: Fix stale HTAB during live migration (KVM)
  spapr: Fix integer overflow during migration (TCG)
  spapr: Fix stale HTAB during live migration (TCG)

 hw/ppc/spapr.c         | 60 +++++++++++++++++++++++++++++++++++++++++++-------
 include/hw/ppc/spapr.h |  1 +
 2 files changed, 53 insertions(+), 8 deletions(-)

-- 
1.9.3

* [Qemu-devel] [PATCH V2 1/3] spapr: Fix stale HTAB during live migration (KVM)
  2014-11-17  4:12 [Qemu-devel] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration Samuel Mendoza-Jonas
@ 2014-11-17  4:12 ` Samuel Mendoza-Jonas
  2014-11-17  4:12 ` [Qemu-devel] [PATCH V2 2/3] spapr: Fix integer overflow during migration (TCG) Samuel Mendoza-Jonas
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Samuel Mendoza-Jonas @ 2014-11-17  4:12 UTC
  To: qemu-ppc, qemu-devel; +Cc: aik, Samuel Mendoza-Jonas

If a guest reboots during a running migration, changes to the
hash page table are not necessarily propagated to the destination.
Opening a new file descriptor to the HTAB forces the migration
handler to resend the entire table.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
---
 hw/ppc/spapr.c         | 38 ++++++++++++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h |  1 +
 2 files changed, 39 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 0a2bfe6..eb07343 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -833,6 +833,11 @@ static void spapr_reset_htab(sPAPREnvironment *spapr)
         /* Kernel handles htab, we don't need to allocate one */
         spapr->htab_shift = shift;
         kvmppc_kern_htab = true;
+
+        /* Tell readers to update their file descriptor */
+        if (spapr->htab_fd >= 0) {
+            spapr->htab_fd_stale = true;
+        }
     } else {
         if (!spapr->htab) {
             /* Allocate an htab if we don't yet have one */
@@ -850,6 +855,28 @@ static void spapr_reset_htab(sPAPREnvironment *spapr)
     }
 }
 
+/*
+ * A guest reset will cause spapr->htab_fd to become stale if being used.
+ * Reopen the file descriptor to make sure the whole HTAB is properly read.
+ */
+static int spapr_check_htab_fd(sPAPREnvironment *spapr)
+{
+    int rc = 0;
+
+    if (spapr->htab_fd_stale) {
+        close(spapr->htab_fd);
+        spapr->htab_fd = kvmppc_get_htab_fd(false);
+        if (spapr->htab_fd < 0) {
+            error_report("Unable to open fd for reading hash table from KVM: "
+                    "%s", strerror(errno));
+            rc = -1;
+        }
+        spapr->htab_fd_stale = false;
+    }
+
+    return rc;
+}
+
 static void ppc_spapr_reset(void)
 {
     PowerPCCPU *first_ppc_cpu;
@@ -985,6 +1012,7 @@ static int htab_save_setup(QEMUFile *f, void *opaque)
         assert(kvm_enabled());
 
         spapr->htab_fd = kvmppc_get_htab_fd(false);
+        spapr->htab_fd_stale = false;
         if (spapr->htab_fd < 0) {
             fprintf(stderr, "Unable to open fd for reading hash table from KVM: %s\n",
                     strerror(errno));
@@ -1137,6 +1165,11 @@ static int htab_save_iterate(QEMUFile *f, void *opaque)
     if (!spapr->htab) {
         assert(kvm_enabled());
 
+        rc = spapr_check_htab_fd(spapr);
+        if (rc < 0) {
+            return rc;
+        }
+
         rc = kvmppc_save_htab(f, spapr->htab_fd,
                               MAX_KVM_BUF_SIZE, MAX_ITERATION_NS);
         if (rc < 0) {
@@ -1168,6 +1201,11 @@ static int htab_save_complete(QEMUFile *f, void *opaque)
 
         assert(kvm_enabled());
 
+        rc = spapr_check_htab_fd(spapr);
+        if (rc < 0) {
+            return rc;
+        }
+
         rc = kvmppc_save_htab(f, spapr->htab_fd, MAX_KVM_BUF_SIZE, -1);
         if (rc < 0) {
             return rc;
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 749daf4..716bff4 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -37,6 +37,7 @@ typedef struct sPAPREnvironment {
     int htab_save_index;
     bool htab_first_pass;
     int htab_fd;
+    bool htab_fd_stale;
 } sPAPREnvironment;
 
 #define H_SUCCESS         0
-- 
1.9.3
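
The stale-descriptor handling in this patch can be illustrated outside QEMU.
The following is a minimal sketch, not the QEMU code: struct table_reader,
open_table_fd(), table_reset() and table_check_fd() are hypothetical names,
and the descriptor is faked with a counter instead of calling
kvmppc_get_htab_fd(). It only shows the pattern: the reset path flags the
descriptor as stale, and the reader reopens it lazily before its next read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields the patch uses in sPAPREnvironment. */
struct table_reader {
    int fd;         /* descriptor used to read the table, negative if unused */
    bool fd_stale;  /* set by the reset path, cleared by the reader */
};

/* Fake descriptor source, used only so the sketch is self-contained. */
int next_fake_fd = 3;

/* Hypothetical replacement for kvmppc_get_htab_fd(): returns a new number. */
int open_table_fd(void)
{
    return next_fake_fd++;
}

/* Reset path: do not touch the descriptor, just tell readers it is stale. */
void table_reset(struct table_reader *r)
{
    assert(r != NULL);
    if (r->fd >= 0) {
        r->fd_stale = true;
    }
}

/* Reader path: reopen the descriptor if a reset happened since the last read.
 * Returns 0 on success and -1 if the reopen failed. */
int table_check_fd(struct table_reader *r)
{
    assert(r != NULL);
    if (r->fd_stale) {
        /* Real code would close(r->fd) before replacing it. */
        r->fd = open_table_fd();
        r->fd_stale = false;
        if (r->fd < 0) {
            return -1;
        }
    }
    return 0;
}
```

In this sketch a reader calls table_check_fd() before every read, in the same
way the patch calls spapr_check_htab_fd() before each kvmppc_save_htab().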

* [Qemu-devel] [PATCH V2 2/3] spapr: Fix integer overflow during migration (TCG)
  2014-11-17  4:12 [Qemu-devel] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration Samuel Mendoza-Jonas
  2014-11-17  4:12 ` [Qemu-devel] [PATCH V2 1/3] spapr: Fix stale HTAB during live migration (KVM) Samuel Mendoza-Jonas
@ 2014-11-17  4:12 ` Samuel Mendoza-Jonas
  2014-11-17  4:12 ` [Qemu-devel] [PATCH V2 3/3] spapr: Fix stale HTAB during live " Samuel Mendoza-Jonas
  2014-11-17 12:26 ` [Qemu-devel] [Qemu-ppc] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration Alexander Graf
  3 siblings, 0 replies; 7+ messages in thread
From: Samuel Mendoza-Jonas @ 2014-11-17  4:12 UTC
  To: qemu-ppc, qemu-devel; +Cc: aik, Samuel Mendoza-Jonas

The n_valid and n_invalid fields are unsigned short integers, but it is
possible to have more than 65535 entries in a contiguous chunk, overflowing
the field. This results in an incorrect HTAB being sent to the destination
during migration.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
---
 hw/ppc/spapr.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index eb07343..10b7b00 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1045,7 +1045,7 @@ static void htab_save_first_pass(QEMUFile *f, sPAPREnvironment *spapr,
 
         /* Consume valid HPTEs */
         chunkstart = index;
-        while ((index < htabslots)
+        while ((index < htabslots) && (index - chunkstart < USHRT_MAX)
                && HPTE_VALID(HPTE(spapr->htab, index))) {
             index++;
             CLEAN_HPTE(HPTE(spapr->htab, index));
@@ -1097,7 +1097,7 @@ static int htab_save_later_pass(QEMUFile *f, sPAPREnvironment *spapr,
 
         chunkstart = index;
         /* Consume valid dirty HPTEs */
-        while ((index < htabslots)
+        while ((index < htabslots) && (index - chunkstart < USHRT_MAX)
                && HPTE_DIRTY(HPTE(spapr->htab, index))
                && HPTE_VALID(HPTE(spapr->htab, index))) {
             CLEAN_HPTE(HPTE(spapr->htab, index));
@@ -1107,7 +1107,7 @@ static int htab_save_later_pass(QEMUFile *f, sPAPREnvironment *spapr,
 
         invalidstart = index;
         /* Consume invalid dirty HPTEs */
-        while ((index < htabslots)
+        while ((index < htabslots) && (index - invalidstart < USHRT_MAX)
                && HPTE_DIRTY(HPTE(spapr->htab, index))
                && !HPTE_VALID(HPTE(spapr->htab, index))) {
             CLEAN_HPTE(HPTE(spapr->htab, index));
-- 
1.9.3
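
The overflow described in this patch can be reproduced in isolation. The
sketch below is illustrative only and assumes a 16-bit unsigned short
(USHRT_MAX == 65535); store_run_length() and count_capped_chunks() are
hypothetical helpers, not QEMU functions. The first shows what a 16-bit
field keeps of a longer run, the second mirrors the added
"index - chunkstart < USHRT_MAX" condition to split a run into chunks
that fit.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Store a run length in a 16-bit field, as the n_valid/n_invalid fields do.
 * Values above 65535 wrap modulo 65536. */
uint16_t store_run_length(int run_length)
{
    assert(run_length >= 0);
    return (uint16_t)run_length;
}

/* Walk a run of run_length entries, capping each chunk at USHRT_MAX entries.
 * Returns the number of chunks and reports the largest chunk seen. */
int count_capped_chunks(int run_length, int *largest_chunk)
{
    int index = 0;
    int chunks = 0;

    assert(run_length >= 0);
    assert(largest_chunk != NULL);
    *largest_chunk = 0;

    while (index < run_length) {
        int chunkstart = index;

        /* Same shape as the patched loop condition. */
        while ((index < run_length) && (index - chunkstart < USHRT_MAX)) {
            index++;
        }
        if (index - chunkstart > *largest_chunk) {
            *largest_chunk = index - chunkstart;
        }
        chunks++;
    }
    return chunks;
}
```

With a run of 70000 entries, the uncapped length stored in 16 bits becomes
4464, while the capped walk produces two chunks of 65535 and 4465 entries.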

* [Qemu-devel] [PATCH V2 3/3] spapr: Fix stale HTAB during live migration (TCG)
  2014-11-17  4:12 [Qemu-devel] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration Samuel Mendoza-Jonas
  2014-11-17  4:12 ` [Qemu-devel] [PATCH V2 1/3] spapr: Fix stale HTAB during live migration (KVM) Samuel Mendoza-Jonas
  2014-11-17  4:12 ` [Qemu-devel] [PATCH V2 2/3] spapr: Fix integer overflow during migration (TCG) Samuel Mendoza-Jonas
@ 2014-11-17  4:12 ` Samuel Mendoza-Jonas
  2014-11-17 12:26 ` [Qemu-devel] [Qemu-ppc] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration Alexander Graf
  3 siblings, 0 replies; 7+ messages in thread
From: Samuel Mendoza-Jonas @ 2014-11-17  4:12 UTC
  To: qemu-ppc, qemu-devel; +Cc: aik, Samuel Mendoza-Jonas

If a TCG guest reboots during a running migration, HTAB entries are not
marked dirty, and the destination boots with an invalid HTAB.

When a reboot occurs, explicitly mark the current HTAB dirty after
clearing it.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
---
 hw/ppc/spapr.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 10b7b00..c2efbcd 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -819,9 +819,16 @@ static void emulate_spapr_hypercall(PowerPCCPU *cpu)
     }
 }
 
+#define HPTE(_table, _i)   (void *)(((uint64_t *)(_table)) + ((_i) * 2))
+#define HPTE_VALID(_hpte)  (tswap64(*((uint64_t *)(_hpte))) & HPTE64_V_VALID)
+#define HPTE_DIRTY(_hpte)  (tswap64(*((uint64_t *)(_hpte))) & HPTE64_V_HPTE_DIRTY)
+#define CLEAN_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) &= tswap64(~HPTE64_V_HPTE_DIRTY))
+#define DIRTY_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) |= tswap64(HPTE64_V_HPTE_DIRTY))
+
 static void spapr_reset_htab(sPAPREnvironment *spapr)
 {
     long shift;
+    int index;
 
     /* allocate hash page table.  For now we always make this 16mb,
      * later we should probably make it scale to the size of guest
@@ -846,6 +853,10 @@ static void spapr_reset_htab(sPAPREnvironment *spapr)
 
         /* And clear it */
         memset(spapr->htab, 0, HTAB_SIZE(spapr));
+
+        for (index = 0; index < HTAB_SIZE(spapr) / HASH_PTE_SIZE_64; index++) {
+            DIRTY_HPTE(HPTE(spapr->htab, index));
+        }
     }
 
     /* Update the RMA size if necessary */
@@ -993,11 +1004,6 @@ static const VMStateDescription vmstate_spapr = {
     },
 };
 
-#define HPTE(_table, _i)   (void *)(((uint64_t *)(_table)) + ((_i) * 2))
-#define HPTE_VALID(_hpte)  (tswap64(*((uint64_t *)(_hpte))) & HPTE64_V_VALID)
-#define HPTE_DIRTY(_hpte)  (tswap64(*((uint64_t *)(_hpte))) & HPTE64_V_HPTE_DIRTY)
-#define CLEAN_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) &= tswap64(~HPTE64_V_HPTE_DIRTY))
-
 static int htab_save_setup(QEMUFile *f, void *opaque)
 {
     sPAPREnvironment *spapr = opaque;
-- 
1.9.3
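
The clear-then-mark-dirty step in this patch can be sketched on a plain
array. This is a simplified illustration, not the QEMU code: ENTRY_DIRTY is
a hypothetical bit value, entry(), table_clear_and_dirty() and count_dirty()
are made-up helpers, and the tswap64() byte swapping used by the real macros
is left out, so the sketch works on host-endian words only. As in the HPTE()
macro, each entry is two 64-bit words and the flag lives in the first one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ENTRY_WORDS 2                    /* two 64-bit words per entry */
#define ENTRY_DIRTY (UINT64_C(1) << 6)   /* hypothetical dirty flag */

/* Return a pointer to the first word of entry i. */
uint64_t *entry(uint64_t *table, size_t i)
{
    return table + i * ENTRY_WORDS;
}

/* Zero the whole table, then flag every entry dirty so that a later
 * incremental pass sends all of them again. */
void table_clear_and_dirty(uint64_t *table, size_t slots)
{
    assert(table != NULL);
    memset(table, 0, slots * ENTRY_WORDS * sizeof(uint64_t));
    for (size_t i = 0; i < slots; i++) {
        *entry(table, i) |= ENTRY_DIRTY;
    }
}

/* Count entries whose dirty flag is set. */
size_t count_dirty(uint64_t *table, size_t slots)
{
    size_t n = 0;

    assert(table != NULL);
    for (size_t i = 0; i < slots; i++) {
        if (*entry(table, i) & ENTRY_DIRTY) {
            n++;
        }
    }
    return n;
}
```

Without the marking loop, a freshly zeroed table would look clean to a pass
that only sends dirty entries, which is the failure the commit message
describes.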

* Re: [Qemu-devel] [Qemu-ppc] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration
  2014-11-17  4:12 [Qemu-devel] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration Samuel Mendoza-Jonas
                   ` (2 preceding siblings ...)
  2014-11-17  4:12 ` [Qemu-devel] [PATCH V2 3/3] spapr: Fix stale HTAB during live " Samuel Mendoza-Jonas
@ 2014-11-17 12:26 ` Alexander Graf
  2014-11-24  6:48   ` Alexey Kardashevskiy
  3 siblings, 1 reply; 7+ messages in thread
From: Alexander Graf @ 2014-11-17 12:26 UTC
  To: Samuel Mendoza-Jonas
  Cc: Alexey Kardashevskiy, qemu-ppc@nongnu.org, qemu-devel@nongnu.org




> On 17.11.2014 at 05:12, Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> wrote:
> 
> If a spapr guest reboots during a live migration, the guest HTAB on the
> destination is not updated properly, usually resulting in a kernel panic.
> 
> This is a (delayed!) follow up to my previous patch including a fix
> for TCG guests as well as KVM.
> 
> Changes from V1:
> - Split out overflow fix into separate patch
> - Removed unnecessary locks (relevant operations occur under BQL)
> - TCG: Set HTAB dirty instead of resetting migration state
> - Minor style fixes

Looks great to me, but I would like to get a reviewed-by from Alexey as well ;)


Alex

> 
> Samuel Mendoza-Jonas (3):
>  spapr: Fix stale HTAB during live migration (KVM)
>  spapr: Fix integer overflow during migration (TCG)
>  spapr: Fix stale HTAB during live migration (TCG)
> 
> hw/ppc/spapr.c         | 60 +++++++++++++++++++++++++++++++++++++++++++-------
> include/hw/ppc/spapr.h |  1 +
> 2 files changed, 53 insertions(+), 8 deletions(-)
> 
> -- 
> 1.9.3
> 
> 

* Re: [Qemu-devel] [Qemu-ppc] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration
  2014-11-17 12:26 ` [Qemu-devel] [Qemu-ppc] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration Alexander Graf
@ 2014-11-24  6:48   ` Alexey Kardashevskiy
  2014-11-24  9:48     ` Alexander Graf
  0 siblings, 1 reply; 7+ messages in thread
From: Alexey Kardashevskiy @ 2014-11-24  6:48 UTC
  To: Alexander Graf, Samuel Mendoza-Jonas
  Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org

On 11/17/2014 11:26 PM, Alexander Graf wrote:
> 
> 
> 
>> On 17.11.2014 at 05:12, Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> wrote:
>>
>> If a spapr guest reboots during a live migration, the guest HTAB on the
>> destination is not updated properly, usually resulting in a kernel panic.
>>
>> This is a (delayed!) follow up to my previous patch including a fix
>> for TCG guests as well as KVM.
>>
>> Changes from V1:
>> - Split out overflow fix into separate patch
>> - Removed unnecessary locks (relevant operations occur under BQL)
>> - TCG: Set HTAB dirty instead of resetting migration state
>> - Minor style fixes
> 
> Looks great to me, but I would like to get a reviewed-by from Alexey as well ;)

Looks good to me too.

Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>


> 
> 
> Alex
> 
>>
>> Samuel Mendoza-Jonas (3):
>>  spapr: Fix stale HTAB during live migration (KVM)
>>  spapr: Fix integer overflow during migration (TCG)
>>  spapr: Fix stale HTAB during live migration (TCG)
>>
>> hw/ppc/spapr.c         | 60 +++++++++++++++++++++++++++++++++++++++++++-------
>> include/hw/ppc/spapr.h |  1 +
>> 2 files changed, 53 insertions(+), 8 deletions(-)
>>
>> -- 
>> 1.9.3
>>
>>


-- 
Alexey

* Re: [Qemu-devel] [Qemu-ppc] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration
  2014-11-24  6:48   ` Alexey Kardashevskiy
@ 2014-11-24  9:48     ` Alexander Graf
  0 siblings, 0 replies; 7+ messages in thread
From: Alexander Graf @ 2014-11-24  9:48 UTC
  To: Alexey Kardashevskiy, Samuel Mendoza-Jonas
  Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org



On 24.11.14 07:48, Alexey Kardashevskiy wrote:
> On 11/17/2014 11:26 PM, Alexander Graf wrote:
>>
>>
>>
>>> On 17.11.2014 at 05:12, Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> wrote:
>>>
>>> If a spapr guest reboots during a live migration, the guest HTAB on the
>>> destination is not updated properly, usually resulting in a kernel panic.
>>>
>>> This is a (delayed!) follow up to my previous patch including a fix
>>> for TCG guests as well as KVM.
>>>
>>> Changes from V1:
>>> - Split out overflow fix into separate patch
>>> - Removed unnecessary locks (relevant operations occur under BQL)
>>> - TCG: Set HTAB dirty instead of resetting migration state
>>> - Minor style fixes
>>
>> Looks great to me, but I would like to get a reviewed-by from Alexey as well ;)
> 
> Looks good to me too.
> 
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Thanks, applied all to ppc-next-2.3.


Alex
