* [PATCH] hw/char/pl011: Fix clock migration failure
@ 2021-03-17 4:44 Gavin Shan
2021-03-17 9:09 ` Peter Maydell
0 siblings, 1 reply; 10+ messages in thread
From: Gavin Shan @ 2021-03-17 4:44 UTC (permalink / raw)
To: qemu-arm
Cc: peter.maydell, luc, qemu-devel, f4bug, shan.gavin, pbonzini,
marcandre.lureau
There is a added clock to trace buad rate change since v5.2.0 by
commit aac63e0e6ea3 ("hw/char/pl011: add a clock input"). The added
clock causes migration failure. For example, migration from v5.2.0
to v5.1.0 can fail with the following error messages:
qemu-system-aarch64: error while loading state for instance 0x0 \
of device 'pl011'
qemu-system-aarch64: load of migration failed: No such file or \
directory
This fixes the issue by reporting the baud rate change at post load
time so that the clock won't be migrated by sub-section to avoid the
migration failure.
Cc: qemu-stable@nongnu.org
Fixes: aac63e0e6ea3 ("hw/char/pl011: add a clock input")
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
hw/char/pl011.c | 22 +++++++++-------------
1 file changed, 9 insertions(+), 13 deletions(-)
diff --git a/hw/char/pl011.c b/hw/char/pl011.c
index c5621a195f..401bd28536 100644
--- a/hw/char/pl011.c
+++ b/hw/char/pl011.c
@@ -322,20 +322,20 @@ static const MemoryRegionOps pl011_ops = {
.endianness = DEVICE_NATIVE_ENDIAN,
};
-static const VMStateDescription vmstate_pl011_clock = {
- .name = "pl011/clock",
- .version_id = 1,
- .minimum_version_id = 1,
- .fields = (VMStateField[]) {
- VMSTATE_CLOCK(clk, PL011State),
- VMSTATE_END_OF_LIST()
- }
-};
+static int pl011_post_load(void *opaque, int version_id)
+{
+ PL011State *s = PL011(opaque);
+
+ pl011_trace_baudrate_change(s);
+
+ return 0;
+}
static const VMStateDescription vmstate_pl011 = {
.name = "pl011",
.version_id = 2,
.minimum_version_id = 2,
+ .post_load = pl011_post_load,
.fields = (VMStateField[]) {
VMSTATE_UINT32(readbuff, PL011State),
VMSTATE_UINT32(flags, PL011State),
@@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
VMSTATE_INT32(read_trigger, PL011State),
VMSTATE_END_OF_LIST()
},
- .subsections = (const VMStateDescription * []) {
- &vmstate_pl011_clock,
- NULL
- }
};
static Property pl011_properties[] = {
--
2.23.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] hw/char/pl011: Fix clock migration failure
2021-03-17 4:44 [PATCH] hw/char/pl011: Fix clock migration failure Gavin Shan
@ 2021-03-17 9:09 ` Peter Maydell
2021-03-17 10:37 ` Gavin Shan
0 siblings, 1 reply; 10+ messages in thread
From: Peter Maydell @ 2021-03-17 9:09 UTC (permalink / raw)
To: Gavin Shan
Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
qemu-arm, Shan Gavin, Paolo Bonzini, Marc-André Lureau
On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
>
> There is a added clock to trace buad rate change since v5.2.0 by
> commit aac63e0e6ea3 ("hw/char/pl011: add a clock input"). The added
> clock causes migration failure. For example, migration from v5.2.0
> to v5.1.0 can fail with the following error messages:
>
> qemu-system-aarch64: error while loading state for instance 0x0 \
> of device 'pl011'
> qemu-system-aarch64: load of migration failed: No such file or \
> directory
>
> This fixes the issue by reporting the baud rate change at post load
> time so that the clock won't be migrated by sub-section to avoid the
> migration failure.
>
> Cc: qemu-stable@nongnu.org
> Fixes: aac63e0e6ea3 ("hw/char/pl011: add a clock input")
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
> hw/char/pl011.c | 22 +++++++++-------------
> 1 file changed, 9 insertions(+), 13 deletions(-)
>
> diff --git a/hw/char/pl011.c b/hw/char/pl011.c
> index c5621a195f..401bd28536 100644
> --- a/hw/char/pl011.c
> +++ b/hw/char/pl011.c
> @@ -322,20 +322,20 @@ static const MemoryRegionOps pl011_ops = {
> .endianness = DEVICE_NATIVE_ENDIAN,
> };
>
> -static const VMStateDescription vmstate_pl011_clock = {
> - .name = "pl011/clock",
> - .version_id = 1,
> - .minimum_version_id = 1,
> - .fields = (VMStateField[]) {
> - VMSTATE_CLOCK(clk, PL011State),
> - VMSTATE_END_OF_LIST()
> - }
> -};
> +static int pl011_post_load(void *opaque, int version_id)
> +{
> + PL011State *s = PL011(opaque);
> +
> + pl011_trace_baudrate_change(s);
> +
> + return 0;
> +}
>
> static const VMStateDescription vmstate_pl011 = {
> .name = "pl011",
> .version_id = 2,
> .minimum_version_id = 2,
> + .post_load = pl011_post_load,
> .fields = (VMStateField[]) {
> VMSTATE_UINT32(readbuff, PL011State),
> VMSTATE_UINT32(flags, PL011State),
> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
> VMSTATE_INT32(read_trigger, PL011State),
> VMSTATE_END_OF_LIST()
> },
> - .subsections = (const VMStateDescription * []) {
> - &vmstate_pl011_clock,
> - NULL
> - }
> };
Doesn't dropping the subsection break migration compat ?
thanks
-- PMM
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] hw/char/pl011: Fix clock migration failure
2021-03-17 9:09 ` Peter Maydell
@ 2021-03-17 10:37 ` Gavin Shan
2021-03-17 10:40 ` Peter Maydell
0 siblings, 1 reply; 10+ messages in thread
From: Gavin Shan @ 2021-03-17 10:37 UTC (permalink / raw)
To: Peter Maydell
Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
qemu-arm, Shan Gavin, Paolo Bonzini, Marc-André Lureau
Hi Peter,
On 3/17/21 8:09 PM, Peter Maydell wrote:
> On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
>>
>> There is a added clock to trace buad rate change since v5.2.0 by
>> commit aac63e0e6ea3 ("hw/char/pl011: add a clock input"). The added
>> clock causes migration failure. For example, migration from v5.2.0
>> to v5.1.0 can fail with the following error messages:
>>
>> qemu-system-aarch64: error while loading state for instance 0x0 \
>> of device 'pl011'
>> qemu-system-aarch64: load of migration failed: No such file or \
>> directory
>>
>> This fixes the issue by reporting the baud rate change at post load
>> time so that the clock won't be migrated by sub-section to avoid the
>> migration failure.
>>
>> Cc: qemu-stable@nongnu.org
>> Fixes: aac63e0e6ea3 ("hw/char/pl011: add a clock input")
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>> hw/char/pl011.c | 22 +++++++++-------------
>> 1 file changed, 9 insertions(+), 13 deletions(-)
>>
>> diff --git a/hw/char/pl011.c b/hw/char/pl011.c
>> index c5621a195f..401bd28536 100644
>> --- a/hw/char/pl011.c
>> +++ b/hw/char/pl011.c
>> @@ -322,20 +322,20 @@ static const MemoryRegionOps pl011_ops = {
>> .endianness = DEVICE_NATIVE_ENDIAN,
>> };
>>
>> -static const VMStateDescription vmstate_pl011_clock = {
>> - .name = "pl011/clock",
>> - .version_id = 1,
>> - .minimum_version_id = 1,
>> - .fields = (VMStateField[]) {
>> - VMSTATE_CLOCK(clk, PL011State),
>> - VMSTATE_END_OF_LIST()
>> - }
>> -};
>> +static int pl011_post_load(void *opaque, int version_id)
>> +{
>> + PL011State *s = PL011(opaque);
>> +
>> + pl011_trace_baudrate_change(s);
>> +
>> + return 0;
>> +}
>>
>> static const VMStateDescription vmstate_pl011 = {
>> .name = "pl011",
>> .version_id = 2,
>> .minimum_version_id = 2,
>> + .post_load = pl011_post_load,
>> .fields = (VMStateField[]) {
>> VMSTATE_UINT32(readbuff, PL011State),
>> VMSTATE_UINT32(flags, PL011State),
>> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
>> VMSTATE_INT32(read_trigger, PL011State),
>> VMSTATE_END_OF_LIST()
>> },
>> - .subsections = (const VMStateDescription * []) {
>> - &vmstate_pl011_clock,
>> - NULL
>> - }
>> };
>
> Doesn't dropping the subsection break migration compat ?
>
It's why this patch needs to be backported to stable branches.
In that way, we won't have migration compatible issue.
Thanks,
Gavin
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] hw/char/pl011: Fix clock migration failure
2021-03-17 10:37 ` Gavin Shan
@ 2021-03-17 10:40 ` Peter Maydell
2021-03-17 10:59 ` Gavin Shan
0 siblings, 1 reply; 10+ messages in thread
From: Peter Maydell @ 2021-03-17 10:40 UTC (permalink / raw)
To: Gavin Shan
Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
qemu-arm, Shan Gavin, Paolo Bonzini, Marc-André Lureau
On Wed, 17 Mar 2021 at 10:37, Gavin Shan <gshan@redhat.com> wrote:
>
> Hi Peter,
>
> On 3/17/21 8:09 PM, Peter Maydell wrote:
> > On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
> >>
> >> static const VMStateDescription vmstate_pl011 = {
> >> .name = "pl011",
> >> .version_id = 2,
> >> .minimum_version_id = 2,
> >> + .post_load = pl011_post_load,
> >> .fields = (VMStateField[]) {
> >> VMSTATE_UINT32(readbuff, PL011State),
> >> VMSTATE_UINT32(flags, PL011State),
> >> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
> >> VMSTATE_INT32(read_trigger, PL011State),
> >> VMSTATE_END_OF_LIST()
> >> },
> >> - .subsections = (const VMStateDescription * []) {
> >> - &vmstate_pl011_clock,
> >> - NULL
> >> - }
> >> };
> >
> > Doesn't dropping the subsection break migration compat ?
> >
>
> It's why this patch needs to be backported to stable branches.
> In that way, we won't have migration compatible issue.
No, migration has to work from the existing already
shipped 5.1, 5.2, etc releases to 6.0 (assuming you use
the correct "virt-5.2" &c versioned machine type.)
thanks
-- PMM
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] hw/char/pl011: Fix clock migration failure
2021-03-17 10:40 ` Peter Maydell
@ 2021-03-17 10:59 ` Gavin Shan
2021-03-17 11:14 ` Peter Maydell
0 siblings, 1 reply; 10+ messages in thread
From: Gavin Shan @ 2021-03-17 10:59 UTC (permalink / raw)
To: Peter Maydell
Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
qemu-arm, Shan Gavin, Paolo Bonzini, Marc-André Lureau
Hi Peter,
On 3/17/21 9:40 PM, Peter Maydell wrote:
> On Wed, 17 Mar 2021 at 10:37, Gavin Shan <gshan@redhat.com> wrote:
>> On 3/17/21 8:09 PM, Peter Maydell wrote:
>>> On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
>>>>
>>>> static const VMStateDescription vmstate_pl011 = {
>>>> .name = "pl011",
>>>> .version_id = 2,
>>>> .minimum_version_id = 2,
>>>> + .post_load = pl011_post_load,
>>>> .fields = (VMStateField[]) {
>>>> VMSTATE_UINT32(readbuff, PL011State),
>>>> VMSTATE_UINT32(flags, PL011State),
>>>> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
>>>> VMSTATE_INT32(read_trigger, PL011State),
>>>> VMSTATE_END_OF_LIST()
>>>> },
>>>> - .subsections = (const VMStateDescription * []) {
>>>> - &vmstate_pl011_clock,
>>>> - NULL
>>>> - }
>>>> };
>>>
>>> Doesn't dropping the subsection break migration compat ?
>>>
>>
>> It's why this patch needs to be backported to stable branches.
>> In that way, we won't have migration compatible issue.
>
> No, migration has to work from the existing already
> shipped 5.1, 5.2, etc releases to 6.0 (assuming you use
> the correct "virt-5.2" &c versioned machine type.)
>
Commit aac63e0e6ea3 ("hw/char/pl011: add a clock input") is merged
to v5.2.0. The migration failure happens during migration from v6.0
to v5.1 with machine type as "virt-5.1", instead of migrating from
v5.1 to v6.0. One question is if we need support backwards migration?
If we do support backwards migration, I think there are two options:
(1) merge this patch and backport it to v5.2+; (2) Backport commit
aac63e0e6ea3 to v5.2-. I guess (1) would be right way to go because
it's less effort than (2). Besides, the clock needn't to be migrated
if I'm correct.
Thanks,
Gavin
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] hw/char/pl011: Fix clock migration failure
2021-03-17 10:59 ` Gavin Shan
@ 2021-03-17 11:14 ` Peter Maydell
2021-03-17 12:54 ` Andrew Jones
0 siblings, 1 reply; 10+ messages in thread
From: Peter Maydell @ 2021-03-17 11:14 UTC (permalink / raw)
To: Gavin Shan
Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
qemu-arm, Shan Gavin, Paolo Bonzini, Marc-André Lureau
On Wed, 17 Mar 2021 at 10:59, Gavin Shan <gshan@redhat.com> wrote:
>
> Hi Peter,
>
> On 3/17/21 9:40 PM, Peter Maydell wrote:
> > On Wed, 17 Mar 2021 at 10:37, Gavin Shan <gshan@redhat.com> wrote:
> >> On 3/17/21 8:09 PM, Peter Maydell wrote:
> >>> On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
> >>>>
> >>>> static const VMStateDescription vmstate_pl011 = {
> >>>> .name = "pl011",
> >>>> .version_id = 2,
> >>>> .minimum_version_id = 2,
> >>>> + .post_load = pl011_post_load,
> >>>> .fields = (VMStateField[]) {
> >>>> VMSTATE_UINT32(readbuff, PL011State),
> >>>> VMSTATE_UINT32(flags, PL011State),
> >>>> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
> >>>> VMSTATE_INT32(read_trigger, PL011State),
> >>>> VMSTATE_END_OF_LIST()
> >>>> },
> >>>> - .subsections = (const VMStateDescription * []) {
> >>>> - &vmstate_pl011_clock,
> >>>> - NULL
> >>>> - }
> >>>> };
> >>>
> >>> Doesn't dropping the subsection break migration compat ?
> >>>
> >>
> >> It's why this patch needs to be backported to stable branches.
> >> In that way, we won't have migration compatible issue.
> >
> > No, migration has to work from the existing already
> > shipped 5.1, 5.2, etc releases to 6.0 (assuming you use
> > the correct "virt-5.2" &c versioned machine type.)
> >
>
> Commit aac63e0e6ea3 ("hw/char/pl011: add a clock input") is merged
> to v5.2.0. The migration failure happens during migration from v6.0
> to v5.1 with machine type as "virt-5.1", instead of migrating from
> v5.1 to v6.0. One question is if we need support backwards migration?
Upstream doesn't care about backwards migration. AIUI
RedHat as a downstream care about the backwards-migration
case in some specific situations, but I don't know if that
would include this one.
I misread the commit message of this patch and hadn't
realised that it was talking about a 5.2 -> 5.1 migration.
From upstream's point of view, commit aac63e0e6ea3 is fine
because it preserves forwards migration (5.1 -> 5.2).
If there's a change that makes RH-as-a-downstream's life
easier without breaking upstream's forward-compat requirements,
we can look at it: but unless I'm misreading this patch,
it would break 5.2 -> 6.0 migration, because 5.2 sends the
subsection and 6.0 with this patch would say "I don't know
how to deal with this subsection I've been sent".
> If we do support backwards migration, I think there are two options:
> (1) merge this patch and backport it to v5.2+; (2) Backport commit
> aac63e0e6ea3 to v5.2-. I guess (1) would be right way to go because
> it's less effort than (2). Besides, the clock needn't to be migrated
> if I'm correct.
You can't fix a backwards migration issue (if you care about it)
by backporting any patch to anywhere -- you need to deal with what
has already shipped.
thanks
-- PMM
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] hw/char/pl011: Fix clock migration failure
2021-03-17 11:14 ` Peter Maydell
@ 2021-03-17 12:54 ` Andrew Jones
2021-03-17 13:09 ` Philippe Mathieu-Daudé
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Andrew Jones @ 2021-03-17 12:54 UTC (permalink / raw)
To: Peter Maydell
Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
Marc-André Lureau, qemu-arm, Shan Gavin, Paolo Bonzini,
Gavin Shan
On Wed, Mar 17, 2021 at 11:14:56AM +0000, Peter Maydell wrote:
> On Wed, 17 Mar 2021 at 10:59, Gavin Shan <gshan@redhat.com> wrote:
> >
> > Hi Peter,
> >
> > On 3/17/21 9:40 PM, Peter Maydell wrote:
> > > On Wed, 17 Mar 2021 at 10:37, Gavin Shan <gshan@redhat.com> wrote:
> > >> On 3/17/21 8:09 PM, Peter Maydell wrote:
> > >>> On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
> > >>>>
> > >>>> static const VMStateDescription vmstate_pl011 = {
> > >>>> .name = "pl011",
> > >>>> .version_id = 2,
> > >>>> .minimum_version_id = 2,
> > >>>> + .post_load = pl011_post_load,
> > >>>> .fields = (VMStateField[]) {
> > >>>> VMSTATE_UINT32(readbuff, PL011State),
> > >>>> VMSTATE_UINT32(flags, PL011State),
> > >>>> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
> > >>>> VMSTATE_INT32(read_trigger, PL011State),
> > >>>> VMSTATE_END_OF_LIST()
> > >>>> },
> > >>>> - .subsections = (const VMStateDescription * []) {
> > >>>> - &vmstate_pl011_clock,
> > >>>> - NULL
> > >>>> - }
> > >>>> };
> > >>>
> > >>> Doesn't dropping the subsection break migration compat ?
> > >>>
> > >>
> > >> It's why this patch needs to be backported to stable branches.
> > >> In that way, we won't have migration compatible issue.
> > >
> > > No, migration has to work from the existing already
> > > shipped 5.1, 5.2, etc releases to 6.0 (assuming you use
> > > the correct "virt-5.2" &c versioned machine type.)
> > >
> >
> > Commit aac63e0e6ea3 ("hw/char/pl011: add a clock input") is merged
> > to v5.2.0. The migration failure happens during migration from v6.0
> > to v5.1 with machine type as "virt-5.1", instead of migrating from
> > v5.1 to v6.0. One question is if we need support backwards migration?
>
> Upstream doesn't care about backwards migration. AIUI
> RedHat as a downstream care about the backwards-migration
> case in some specific situations, but I don't know if that
> would include this one.
Right, we do prefer to be able to support "ping-pong" migrations. For
example, if we start a virt-5.1 machine on a 5.1 build of QEMU, and then
migrate it to a 5.2 build of QEMU, we'd like to also be able to go back
to the 5.1 build.
I agree this patch is not the right approach. I think the right approach
is to introduce a compat property and make the "new" section dependent
on it. And then update the hw_compat_* arrays. Gavin, please take a look
at "Connecting subsections to properties" of docs/devel/migration.rst.
I'm also curious what the state of mach-virt's machine types are for
migration. It'd be nice to exhaustively test both forward migration of
all machine types and ping-pong migrations of all machine types. We can
then consider each issue we find (the pessimist in me suggests we'll find
more than this pl011 issue) and how/if we want to resolve them.
Thanks,
drew
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] hw/char/pl011: Fix clock migration failure
2021-03-17 12:54 ` Andrew Jones
@ 2021-03-17 13:09 ` Philippe Mathieu-Daudé
2021-03-17 13:22 ` Peter Maydell
2021-03-18 2:34 ` Gavin Shan
2 siblings, 0 replies; 10+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-03-17 13:09 UTC (permalink / raw)
To: Andrew Jones, Peter Maydell
Cc: Gavin Shan, Luc Michel, Beraldo Leal, QEMU Developers, qemu-arm,
avocado-devel, Shan Gavin, Paolo Bonzini, Marc-André Lureau
+Beraldo
On 3/17/21 1:54 PM, Andrew Jones wrote:
> On Wed, Mar 17, 2021 at 11:14:56AM +0000, Peter Maydell wrote:
>> On Wed, 17 Mar 2021 at 10:59, Gavin Shan <gshan@redhat.com> wrote:
>>>
>>> Hi Peter,
>>>
>>> On 3/17/21 9:40 PM, Peter Maydell wrote:
>>>> On Wed, 17 Mar 2021 at 10:37, Gavin Shan <gshan@redhat.com> wrote:
>>>>> On 3/17/21 8:09 PM, Peter Maydell wrote:
>>>>>> On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
>>>>>>>
>>>>>>> static const VMStateDescription vmstate_pl011 = {
>>>>>>> .name = "pl011",
>>>>>>> .version_id = 2,
>>>>>>> .minimum_version_id = 2,
>>>>>>> + .post_load = pl011_post_load,
>>>>>>> .fields = (VMStateField[]) {
>>>>>>> VMSTATE_UINT32(readbuff, PL011State),
>>>>>>> VMSTATE_UINT32(flags, PL011State),
>>>>>>> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
>>>>>>> VMSTATE_INT32(read_trigger, PL011State),
>>>>>>> VMSTATE_END_OF_LIST()
>>>>>>> },
>>>>>>> - .subsections = (const VMStateDescription * []) {
>>>>>>> - &vmstate_pl011_clock,
>>>>>>> - NULL
>>>>>>> - }
>>>>>>> };
>>>>>>
>>>>>> Doesn't dropping the subsection break migration compat ?
>>>>>>
>>>>>
>>>>> It's why this patch needs to be backported to stable branches.
>>>>> In that way, we won't have migration compatible issue.
>>>>
>>>> No, migration has to work from the existing already
>>>> shipped 5.1, 5.2, etc releases to 6.0 (assuming you use
>>>> the correct "virt-5.2" &c versioned machine type.)
>>>>
>>>
>>> Commit aac63e0e6ea3 ("hw/char/pl011: add a clock input") is merged
>>> to v5.2.0. The migration failure happens during migration from v6.0
>>> to v5.1 with machine type as "virt-5.1", instead of migrating from
>>> v5.1 to v6.0. One question is if we need support backwards migration?
>>
>> Upstream doesn't care about backwards migration. AIUI
>> RedHat as a downstream care about the backwards-migration
>> case in some specific situations, but I don't know if that
>> would include this one.
>
> Right, we do prefer to be able to support "ping-pong" migrations. For
> example, if we start a virt-5.1 machine on a 5.1 build of QEMU, and then
> migrate it to a 5.2 build of QEMU, we'd like to also be able to go back
> to the 5.1 build.
>
> I agree this patch is not the right approach. I think the right approach
> is to introduce a compat property and make the "new" section dependent
> on it. And then update the hw_compat_* arrays. Gavin, please take a look
> at "Connecting subsections to properties" of docs/devel/migration.rst.
>
> I'm also curious what the state of mach-virt's machine types are for
> migration. It'd be nice to exhaustively test both forward migration of
> all machine types and ping-pong migrations of all machine types.
FYI this test has been suggested to the Avocado team few times.
They might already have a ticket to track any progress.
> We can
> then consider each issue we find (the pessimist in me suggests we'll find
> more than this pl011 issue) and how/if we want to resolve them.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] hw/char/pl011: Fix clock migration failure
2021-03-17 12:54 ` Andrew Jones
2021-03-17 13:09 ` Philippe Mathieu-Daudé
@ 2021-03-17 13:22 ` Peter Maydell
2021-03-18 2:34 ` Gavin Shan
2 siblings, 0 replies; 10+ messages in thread
From: Peter Maydell @ 2021-03-17 13:22 UTC (permalink / raw)
To: Andrew Jones
Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
Marc-André Lureau, qemu-arm, Shan Gavin, Paolo Bonzini,
Gavin Shan
On Wed, 17 Mar 2021 at 12:55, Andrew Jones <drjones@redhat.com> wrote:
> I'm also curious what the state of mach-virt's machine types are for
> migration.
Probably not great -- I don't think anybody is really testing
cross-version migration, and I don't think there's a great
deal of in-practice use of it for Arm either. (See also the
issue with accidentally having env->features in the CPU
migration data, which broke cross-version migration: that
was around a while and we only got one user complaint about it.)
Unless we have a serious test suite for this kind of thing
upstream it's just going to continue to be broken, because at
the moment all we have is "people make best-efforts to think
about migration compat when coding and reviewing, but don't
actually test".
thanks
-- PMM
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] hw/char/pl011: Fix clock migration failure
2021-03-17 12:54 ` Andrew Jones
2021-03-17 13:09 ` Philippe Mathieu-Daudé
2021-03-17 13:22 ` Peter Maydell
@ 2021-03-18 2:34 ` Gavin Shan
2 siblings, 0 replies; 10+ messages in thread
From: Gavin Shan @ 2021-03-18 2:34 UTC (permalink / raw)
To: Andrew Jones, Peter Maydell
Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
qemu-arm, Shan Gavin, Marc-André Lureau, Paolo Bonzini
Hi Drew,
On 3/17/21 11:54 PM, Andrew Jones wrote:
> On Wed, Mar 17, 2021 at 11:14:56AM +0000, Peter Maydell wrote:
>> On Wed, 17 Mar 2021 at 10:59, Gavin Shan <gshan@redhat.com> wrote:
>>> On 3/17/21 9:40 PM, Peter Maydell wrote:
>>>> On Wed, 17 Mar 2021 at 10:37, Gavin Shan <gshan@redhat.com> wrote:
>>>>> On 3/17/21 8:09 PM, Peter Maydell wrote:
>>>>>> On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
>>>>>>>
>>>>>>> static const VMStateDescription vmstate_pl011 = {
>>>>>>> .name = "pl011",
>>>>>>> .version_id = 2,
>>>>>>> .minimum_version_id = 2,
>>>>>>> + .post_load = pl011_post_load,
>>>>>>> .fields = (VMStateField[]) {
>>>>>>> VMSTATE_UINT32(readbuff, PL011State),
>>>>>>> VMSTATE_UINT32(flags, PL011State),
>>>>>>> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
>>>>>>> VMSTATE_INT32(read_trigger, PL011State),
>>>>>>> VMSTATE_END_OF_LIST()
>>>>>>> },
>>>>>>> - .subsections = (const VMStateDescription * []) {
>>>>>>> - &vmstate_pl011_clock,
>>>>>>> - NULL
>>>>>>> - }
>>>>>>> };
>>>>>>
>>>>>> Doesn't dropping the subsection break migration compat ?
>>>>>>
>>>>>
>>>>> It's why this patch needs to be backported to stable branches.
>>>>> In that way, we won't have migration compatible issue.
>>>>
>>>> No, migration has to work from the existing already
>>>> shipped 5.1, 5.2, etc releases to 6.0 (assuming you use
>>>> the correct "virt-5.2" &c versioned machine type.)
>>>>
>>>
>>> Commit aac63e0e6ea3 ("hw/char/pl011: add a clock input") is merged
>>> to v5.2.0. The migration failure happens during migration from v6.0
>>> to v5.1 with machine type as "virt-5.1", instead of migrating from
>>> v5.1 to v6.0. One question is if we need support backwards migration?
>>
>> Upstream doesn't care about backwards migration. AIUI
>> RedHat as a downstream care about the backwards-migration
>> case in some specific situations, but I don't know if that
>> would include this one.
>
> Right, we do prefer to be able to support "ping-pong" migrations. For
> example, if we start a virt-5.1 machine on a 5.1 build of QEMU, and then
> migrate it to a 5.2 build of QEMU, we'd like to also be able to go back
> to the 5.1 build.
>
> I agree this patch is not the right approach. I think the right approach
> is to introduce a compat property and make the "new" section dependent
> on it. And then update the hw_compat_* arrays. Gavin, please take a look
> at "Connecting subsections to properties" of docs/devel/migration.rst.
>
Agree and thanks for the pointer. I will post another patch to have
something in hw_compat_5_1 to address this particular issue.
> I'm also curious what the state of mach-virt's machine types are for
> migration. It'd be nice to exhaustively test both forward migration of
> all machine types and ping-pong migrations of all machine types. We can
> then consider each issue we find (the pessimist in me suggests we'll find
> more than this pl011 issue) and how/if we want to resolve them.
>
Yeah, I will think about it and do the testing to see if there are more
issues. Also, it'd better to be integrated to existing testing framework
as you suggested.
Thanks,
Gavin
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2021-03-18 2:35 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-03-17 4:44 [PATCH] hw/char/pl011: Fix clock migration failure Gavin Shan
2021-03-17 9:09 ` Peter Maydell
2021-03-17 10:37 ` Gavin Shan
2021-03-17 10:40 ` Peter Maydell
2021-03-17 10:59 ` Gavin Shan
2021-03-17 11:14 ` Peter Maydell
2021-03-17 12:54 ` Andrew Jones
2021-03-17 13:09 ` Philippe Mathieu-Daudé
2021-03-17 13:22 ` Peter Maydell
2021-03-18 2:34 ` Gavin Shan
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).