* [Qemu-devel] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration
From: Samuel Mendoza-Jonas @ 2014-11-17 4:12 UTC
To: qemu-ppc, qemu-devel; +Cc: aik, Samuel Mendoza-Jonas

If a spapr guest reboots during a live migration, the guest HTAB on the
destination is not updated properly, usually resulting in a kernel panic.

This is a (delayed!) follow up to my previous patch including a fix
for TCG guests as well as KVM.

Changes from V1:
- Split out overflow fix into separate patch
- Removed unnecessary locks (relevant operations occur under BQL)
- TCG: Set HTAB dirty instead of resetting migration state
- Minor style fixes

Samuel Mendoza-Jonas (3):
  spapr: Fix stale HTAB during live migration (KVM)
  spapr: Fix integer overflow during migration (TCG)
  spapr: Fix stale HTAB during live migration (TCG)

 hw/ppc/spapr.c         | 60 +++++++++++++++++++++++++++++++++++++++++++-------
 include/hw/ppc/spapr.h |  1 +
 2 files changed, 53 insertions(+), 8 deletions(-)

--
1.9.3
* [Qemu-devel] [PATCH V2 1/3] spapr: Fix stale HTAB during live migration (KVM)
From: Samuel Mendoza-Jonas @ 2014-11-17 4:12 UTC
To: qemu-ppc, qemu-devel; +Cc: aik, Samuel Mendoza-Jonas

If a guest reboots during a running migration, changes to the
hash page table are not necessarily updated on the destination.
Opening a new file descriptor to the HTAB forces the migration
handler to resend the entire table.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
---
 hw/ppc/spapr.c         | 38 ++++++++++++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h |  1 +
 2 files changed, 39 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 0a2bfe6..eb07343 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -833,6 +833,11 @@ static void spapr_reset_htab(sPAPREnvironment *spapr)
         /* Kernel handles htab, we don't need to allocate one */
         spapr->htab_shift = shift;
         kvmppc_kern_htab = true;
+
+        /* Tell readers to update their file descriptor */
+        if (spapr->htab_fd >= 0) {
+            spapr->htab_fd_stale = true;
+        }
     } else {
         if (!spapr->htab) {
             /* Allocate an htab if we don't yet have one */
@@ -850,6 +855,28 @@ static void spapr_reset_htab(sPAPREnvironment *spapr)
     }
 }
+/*
+ * A guest reset will cause spapr->htab_fd to become stale if being used.
+ * Reopen the file descriptor to make sure the whole HTAB is properly read.
+ */
+static int spapr_check_htab_fd(sPAPREnvironment *spapr)
+{
+    int rc = 0;
+
+    if (spapr->htab_fd_stale) {
+        close(spapr->htab_fd);
+        spapr->htab_fd = kvmppc_get_htab_fd(false);
+        if (spapr->htab_fd < 0) {
+            error_report("Unable to open fd for reading hash table from KVM: "
+                    "%s", strerror(errno));
+            rc = -1;
+        }
+        spapr->htab_fd_stale = false;
+    }
+
+    return rc;
+}
+
 static void ppc_spapr_reset(void)
 {
     PowerPCCPU *first_ppc_cpu;
@@ -985,6 +1012,7 @@ static int htab_save_setup(QEMUFile *f, void *opaque)
         assert(kvm_enabled());
         spapr->htab_fd = kvmppc_get_htab_fd(false);
+        spapr->htab_fd_stale = false;
         if (spapr->htab_fd < 0) {
             fprintf(stderr, "Unable to open fd for reading hash table from KVM: %s\n",
                     strerror(errno));
@@ -1137,6 +1165,11 @@ static int htab_save_iterate(QEMUFile *f, void *opaque)
     if (!spapr->htab) {
         assert(kvm_enabled());
+        rc = spapr_check_htab_fd(spapr);
+        if (rc < 0) {
+            return rc;
+        }
+
         rc = kvmppc_save_htab(f, spapr->htab_fd,
                               MAX_KVM_BUF_SIZE, MAX_ITERATION_NS);
         if (rc < 0) {
@@ -1168,6 +1201,11 @@ static int htab_save_complete(QEMUFile *f, void *opaque)
         assert(kvm_enabled());
+        rc = spapr_check_htab_fd(spapr);
+        if (rc < 0) {
+            return rc;
+        }
+
         rc = kvmppc_save_htab(f, spapr->htab_fd, MAX_KVM_BUF_SIZE, -1);
         if (rc < 0) {
             return rc;
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 749daf4..716bff4 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -37,6 +37,7 @@ typedef struct sPAPREnvironment {
     int htab_save_index;
     bool htab_first_pass;
     int htab_fd;
+    bool htab_fd_stale;
 } sPAPREnvironment;
 #define H_SUCCESS 0
--
1.9.3
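
The stale-flag pattern in this patch can be sketched in isolation, outside QEMU. The block below is only an illustration under stated assumptions: `struct htab_reader`, `reader_on_reset`, `reader_check_fd` and `demo_open_fd` are hypothetical names that do not appear in the patch, and `demo_open_fd` opens `/dev/null` purely as a stand-in for `kvmppc_get_htab_fd(false)`.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical stand-in for the two sPAPREnvironment fields the patch uses. */
struct htab_reader {
    int fd;        /* descriptor the save handlers read from, or -1 */
    bool fd_stale; /* set on guest reset, cleared once fd is reopened */
};

/* Demo opener (assumption): stands in for kvmppc_get_htab_fd(false). */
static int demo_open_fd(void)
{
    return open("/dev/null", O_RDONLY);
}

/* Reset path: flag the descriptor only if a migration currently holds one. */
static void reader_on_reset(struct htab_reader *r)
{
    if (r->fd >= 0) {
        r->fd_stale = true;
    }
}

/*
 * Save path: reopen a stale descriptor before reading from it.
 * Returns 0 on success (or if nothing had to be done), -1 if reopening failed.
 */
static int reader_check_fd(struct htab_reader *r, int (*open_fn)(void))
{
    if (!r->fd_stale) {
        return 0;
    }
    close(r->fd);
    r->fd = open_fn();
    r->fd_stale = false;
    return r->fd < 0 ? -1 : 0;
}
```

In this sketch the reset path never touches the descriptor itself; it only sets the flag, and the reopen happens lazily the next time a save handler is about to read, which mirrors the split between `spapr_reset_htab()` and `spapr_check_htab_fd()` in the diff above.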
* [Qemu-devel] [PATCH V2 2/3] spapr: Fix integer overflow during migration (TCG)
From: Samuel Mendoza-Jonas @ 2014-11-17 4:12 UTC
To: qemu-ppc, qemu-devel; +Cc: aik, Samuel Mendoza-Jonas

The n_valid and n_invalid fields are unsigned short integers but it is
possible to have more than 65535 entries in a contiguous hunk, overflowing
the field. This results in an incorrect HTAB being sent to the destination
during migration.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
---
 hw/ppc/spapr.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index eb07343..10b7b00 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1045,7 +1045,7 @@ static void htab_save_first_pass(QEMUFile *f, sPAPREnvironment *spapr,
         /* Consume valid HPTEs */
         chunkstart = index;
-        while ((index < htabslots)
+        while ((index < htabslots) && (index - chunkstart < USHRT_MAX)
                && HPTE_VALID(HPTE(spapr->htab, index))) {
             index++;
             CLEAN_HPTE(HPTE(spapr->htab, index));
@@ -1097,7 +1097,7 @@ static int htab_save_later_pass(QEMUFile *f, sPAPREnvironment *spapr,
         chunkstart = index;
         /* Consume valid dirty HPTEs */
-        while ((index < htabslots)
+        while ((index < htabslots) && (index - chunkstart < USHRT_MAX)
                && HPTE_DIRTY(HPTE(spapr->htab, index))
                && HPTE_VALID(HPTE(spapr->htab, index))) {
             CLEAN_HPTE(HPTE(spapr->htab, index));
@@ -1107,7 +1107,7 @@ static int htab_save_later_pass(QEMUFile *f, sPAPREnvironment *spapr,
         invalidstart = index;
         /* Consume invalid dirty HPTEs */
-        while ((index < htabslots)
+        while ((index < htabslots) && (index - invalidstart < USHRT_MAX)
                && HPTE_DIRTY(HPTE(spapr->htab, index))
                && !HPTE_VALID(HPTE(spapr->htab, index))) {
             CLEAN_HPTE(HPTE(spapr->htab, index));
--
1.9.3
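
To make the overflow concrete, here is a small standalone sketch of the capped scan; it is not the QEMU code. `wire_count` and `next_chunk_len` are hypothetical helpers, the table is reduced to a byte array of valid flags, and the arithmetic assumes the usual 16-bit `unsigned short` (`USHRT_MAX == 65535`).

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* What a 16-bit count field ends up holding when handed a longer run. */
static uint16_t wire_count(size_t run_length)
{
    return (uint16_t)run_length; /* wraps modulo 65536 */
}

/*
 * Length of the next chunk of valid slots starting at `start`.
 * The scan stops at the end of the table, at the first invalid slot,
 * or once the chunk holds USHRT_MAX entries, whichever comes first.
 */
static size_t next_chunk_len(const uint8_t *valid, size_t nslots, size_t start)
{
    size_t index = start;

    while (index < nslots && (index - start < USHRT_MAX) && valid[index]) {
        index++;
    }
    return index - start;
}
```

Without the cap, a run of 70000 contiguous entries would be announced as 70000 mod 65536 = 4464. With the cap, the same run is emitted as one chunk of 65535 entries followed by a second chunk of 4465, and each count fits the field.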
* [Qemu-devel] [PATCH V2 3/3] spapr: Fix stale HTAB during live migration (TCG)
From: Samuel Mendoza-Jonas @ 2014-11-17 4:12 UTC
To: qemu-ppc, qemu-devel; +Cc: aik, Samuel Mendoza-Jonas

If a TCG guest reboots during a running migration, HTAB entries are not
marked dirty, and the destination boots with an invalid HTAB.

When a reboot occurs, explicitly mark the current HTAB dirty after
clearing it.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
---
 hw/ppc/spapr.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 10b7b00..c2efbcd 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -819,9 +819,16 @@ static void emulate_spapr_hypercall(PowerPCCPU *cpu)
     }
 }
+#define HPTE(_table, _i) (void *)(((uint64_t *)(_table)) + ((_i) * 2))
+#define HPTE_VALID(_hpte) (tswap64(*((uint64_t *)(_hpte))) & HPTE64_V_VALID)
+#define HPTE_DIRTY(_hpte) (tswap64(*((uint64_t *)(_hpte))) & HPTE64_V_HPTE_DIRTY)
+#define CLEAN_HPTE(_hpte) ((*(uint64_t *)(_hpte)) &= tswap64(~HPTE64_V_HPTE_DIRTY))
+#define DIRTY_HPTE(_hpte) ((*(uint64_t *)(_hpte)) |= tswap64(HPTE64_V_HPTE_DIRTY))
+
 static void spapr_reset_htab(sPAPREnvironment *spapr)
 {
     long shift;
+    int index;
     /* allocate hash page table. For now we always make this 16mb,
      * later we should probably make it scale to the size of guest
@@ -846,6 +853,10 @@ static void spapr_reset_htab(sPAPREnvironment *spapr)
         /* And clear it */
         memset(spapr->htab, 0, HTAB_SIZE(spapr));
+
+        for (index = 0; index < HTAB_SIZE(spapr) / HASH_PTE_SIZE_64; index++) {
+            DIRTY_HPTE(HPTE(spapr->htab, index));
+        }
     }
     /* Update the RMA size if necessary */
@@ -993,11 +1004,6 @@ static const VMStateDescription vmstate_spapr = {
     },
 };
-#define HPTE(_table, _i) (void *)(((uint64_t *)(_table)) + ((_i) * 2))
-#define HPTE_VALID(_hpte) (tswap64(*((uint64_t *)(_hpte))) & HPTE64_V_VALID)
-#define HPTE_DIRTY(_hpte) (tswap64(*((uint64_t *)(_hpte))) & HPTE64_V_HPTE_DIRTY)
-#define CLEAN_HPTE(_hpte) ((*(uint64_t *)(_hpte)) &= tswap64(~HPTE64_V_HPTE_DIRTY))
-
 static int htab_save_setup(QEMUFile *f, void *opaque)
 {
     sPAPREnvironment *spapr = opaque;
--
1.9.3
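
The clear-then-mark-dirty step can be illustrated with a simplified table; this is a sketch, not the patch's code. Each entry is two 64-bit words, as in the `HPTE()` macro, but `DEMO_DIRTY_BIT` is a placeholder value rather than the real `HPTE64_V_HPTE_DIRTY`, the byte swapping done by `tswap64()` is left out, and `demo_reset_htab` and `demo_count_dirty` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_HPTE_WORDS 2                 /* each entry is two 64-bit words */
#define DEMO_DIRTY_BIT ((uint64_t)1 << 6) /* placeholder bit (assumption) */

/* Clear the whole table, then flag every entry so a later pass resends it. */
static void demo_reset_htab(uint64_t *htab, size_t nslots)
{
    size_t i;

    memset(htab, 0, nslots * DEMO_HPTE_WORDS * sizeof(uint64_t));
    for (i = 0; i < nslots; i++) {
        htab[i * DEMO_HPTE_WORDS] |= DEMO_DIRTY_BIT;
    }
}

/* Count the entries whose first word carries the dirty flag. */
static size_t demo_count_dirty(const uint64_t *htab, size_t nslots)
{
    size_t i, n = 0;

    for (i = 0; i < nslots; i++) {
        if (htab[i * DEMO_HPTE_WORDS] & DEMO_DIRTY_BIT) {
            n++;
        }
    }
    return n;
}
```

After the reset every entry is invalid but dirty, so a scan that consumes dirty entries, such as the "invalid dirty HPTEs" loop in `htab_save_later_pass()`, has a reason to visit each slot and tell the destination that it is now empty.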
* Re: [Qemu-devel] [Qemu-ppc] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration
From: Alexander Graf @ 2014-11-17 12:26 UTC
To: Samuel Mendoza-Jonas
Cc: Alexey Kardashevskiy, qemu-ppc@nongnu.org, qemu-devel@nongnu.org
> On 17.11.2014 at 05:12, Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> wrote:
>
> If a spapr guest reboots during a live migration, the guest HTAB on the
> destination is not updated properly, usually resulting in a kernel panic.
>
> This is a (delayed!) follow up to my previous patch including a fix
> for TCG guests as well as KVM.
>
> Changes from V1:
> - Split out overflow fix into separate patch
> - Removed unnecessary locks (relevant operations occur under BQL)
> - TCG: Set HTAB dirty instead of resetting migration state
> - Minor style fixes
Looks great to me, but I would like to get a reviewed-by from Alexey as well ;)
Alex
>
> Samuel Mendoza-Jonas (3):
> spapr: Fix stale HTAB during live migration (KVM)
> spapr: Fix integer overflow during migration (TCG)
> spapr: Fix stale HTAB during live migration (TCG)
>
> hw/ppc/spapr.c | 60 +++++++++++++++++++++++++++++++++++++++++++-------
> include/hw/ppc/spapr.h | 1 +
> 2 files changed, 53 insertions(+), 8 deletions(-)
>
> --
> 1.9.3
>
>
* Re: [Qemu-devel] [Qemu-ppc] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration
From: Alexey Kardashevskiy @ 2014-11-24 6:48 UTC
To: Alexander Graf, Samuel Mendoza-Jonas
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org
On 11/17/2014 11:26 PM, Alexander Graf wrote:
>
>
>
>> On 17.11.2014 at 05:12, Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> wrote:
>>
>> If a spapr guest reboots during a live migration, the guest HTAB on the
>> destination is not updated properly, usually resulting in a kernel panic.
>>
>> This is a (delayed!) follow up to my previous patch including a fix
>> for TCG guests as well as KVM.
>>
>> Changes from V1:
>> - Split out overflow fix into separate patch
>> - Removed unnecessary locks (relevant operations occur under BQL)
>> - TCG: Set HTAB dirty instead of resetting migration state
>> - Minor style fixes
>
> Looks great to me, but I would like to get a reviewed-by from Alexey as well ;)
Looks good to me too.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>
>
> Alex
>
>>
>> Samuel Mendoza-Jonas (3):
>> spapr: Fix stale HTAB during live migration (KVM)
>> spapr: Fix integer overflow during migration (TCG)
>> spapr: Fix stale HTAB during live migration (TCG)
>>
>> hw/ppc/spapr.c | 60 +++++++++++++++++++++++++++++++++++++++++++-------
>> include/hw/ppc/spapr.h | 1 +
>> 2 files changed, 53 insertions(+), 8 deletions(-)
>>
>> --
>> 1.9.3
>>
>>
--
Alexey
* Re: [Qemu-devel] [Qemu-ppc] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration
From: Alexander Graf @ 2014-11-24 9:48 UTC
To: Alexey Kardashevskiy, Samuel Mendoza-Jonas
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org
On 24.11.14 07:48, Alexey Kardashevskiy wrote:
> On 11/17/2014 11:26 PM, Alexander Graf wrote:
>>
>>
>>
>>> On 17.11.2014 at 05:12, Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> wrote:
>>>
>>> If a spapr guest reboots during a live migration, the guest HTAB on the
>>> destination is not updated properly, usually resulting in a kernel panic.
>>>
>>> This is a (delayed!) follow up to my previous patch including a fix
>>> for TCG guests as well as KVM.
>>>
>>> Changes from V1:
>>> - Split out overflow fix into separate patch
>>> - Removed unnecessary locks (relevant operations occur under BQL)
>>> - TCG: Set HTAB dirty instead of resetting migration state
>>> - Minor style fixes
>>
>> Looks great to me, but I would like to get a reviewed-by from Alexey as well ;)
>
> Looks good to me too.
>
> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Thanks, applied all to ppc-next-2.3.
Alex
Thread overview: 7+ messages
2014-11-17 4:12 [Qemu-devel] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration Samuel Mendoza-Jonas
2014-11-17 4:12 ` [Qemu-devel] [PATCH V2 1/3] spapr: Fix stale HTAB during live migration (KVM) Samuel Mendoza-Jonas
2014-11-17 4:12 ` [Qemu-devel] [PATCH V2 2/3] spapr: Fix integer overflow during migration (TCG) Samuel Mendoza-Jonas
2014-11-17 4:12 ` [Qemu-devel] [PATCH V2 3/3] spapr: Fix stale HTAB during live " Samuel Mendoza-Jonas
2014-11-17 12:26 ` [Qemu-devel] [Qemu-ppc] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration Alexander Graf
2014-11-24 6:48 ` Alexey Kardashevskiy
2014-11-24 9:48 ` Alexander Graf