qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] hw/char/pl011: Fix clock migration failure
@ 2021-03-17  4:44 Gavin Shan
  2021-03-17  9:09 ` Peter Maydell
  0 siblings, 1 reply; 10+ messages in thread
From: Gavin Shan @ 2021-03-17  4:44 UTC (permalink / raw)
  To: qemu-arm
  Cc: peter.maydell, luc, qemu-devel, f4bug, shan.gavin, pbonzini,
	marcandre.lureau

There is a added clock to trace buad rate change since v5.2.0 by
commit aac63e0e6ea3 ("hw/char/pl011: add a clock input"). The added
clock causes migration failure. For example, migration from v5.2.0
to v5.1.0 can fail with the following error messages:

   qemu-system-aarch64: error while loading state for instance 0x0 \
                        of device 'pl011'
   qemu-system-aarch64: load of migration failed: No such file or \
                        directory

This fixes the issue by reporting the baud rate change at post load
time so that the clock won't be migrated by sub-section to avoid the
migration failure.

Cc: qemu-stable@nongnu.org
Fixes: aac63e0e6ea3 ("hw/char/pl011: add a clock input")
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 hw/char/pl011.c | 22 +++++++++-------------
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/hw/char/pl011.c b/hw/char/pl011.c
index c5621a195f..401bd28536 100644
--- a/hw/char/pl011.c
+++ b/hw/char/pl011.c
@@ -322,20 +322,20 @@ static const MemoryRegionOps pl011_ops = {
     .endianness = DEVICE_NATIVE_ENDIAN,
 };
 
-static const VMStateDescription vmstate_pl011_clock = {
-    .name = "pl011/clock",
-    .version_id = 1,
-    .minimum_version_id = 1,
-    .fields = (VMStateField[]) {
-        VMSTATE_CLOCK(clk, PL011State),
-        VMSTATE_END_OF_LIST()
-    }
-};
+static int pl011_post_load(void *opaque, int version_id)
+{
+    PL011State *s = PL011(opaque);
+
+    pl011_trace_baudrate_change(s);
+
+    return 0;
+}
 
 static const VMStateDescription vmstate_pl011 = {
     .name = "pl011",
     .version_id = 2,
     .minimum_version_id = 2,
+    .post_load = pl011_post_load,
     .fields = (VMStateField[]) {
         VMSTATE_UINT32(readbuff, PL011State),
         VMSTATE_UINT32(flags, PL011State),
@@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
         VMSTATE_INT32(read_trigger, PL011State),
         VMSTATE_END_OF_LIST()
     },
-    .subsections = (const VMStateDescription * []) {
-        &vmstate_pl011_clock,
-        NULL
-    }
 };
 
 static Property pl011_properties[] = {
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] hw/char/pl011: Fix clock migration failure
  2021-03-17  4:44 [PATCH] hw/char/pl011: Fix clock migration failure Gavin Shan
@ 2021-03-17  9:09 ` Peter Maydell
  2021-03-17 10:37   ` Gavin Shan
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Maydell @ 2021-03-17  9:09 UTC (permalink / raw)
  To: Gavin Shan
  Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
	qemu-arm, Shan Gavin, Paolo Bonzini, Marc-André Lureau

On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
>
> There is a added clock to trace buad rate change since v5.2.0 by
> commit aac63e0e6ea3 ("hw/char/pl011: add a clock input"). The added
> clock causes migration failure. For example, migration from v5.2.0
> to v5.1.0 can fail with the following error messages:
>
>    qemu-system-aarch64: error while loading state for instance 0x0 \
>                         of device 'pl011'
>    qemu-system-aarch64: load of migration failed: No such file or \
>                         directory
>
> This fixes the issue by reporting the baud rate change at post load
> time so that the clock won't be migrated by sub-section to avoid the
> migration failure.
>
> Cc: qemu-stable@nongnu.org
> Fixes: aac63e0e6ea3 ("hw/char/pl011: add a clock input")
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  hw/char/pl011.c | 22 +++++++++-------------
>  1 file changed, 9 insertions(+), 13 deletions(-)
>
> diff --git a/hw/char/pl011.c b/hw/char/pl011.c
> index c5621a195f..401bd28536 100644
> --- a/hw/char/pl011.c
> +++ b/hw/char/pl011.c
> @@ -322,20 +322,20 @@ static const MemoryRegionOps pl011_ops = {
>      .endianness = DEVICE_NATIVE_ENDIAN,
>  };
>
> -static const VMStateDescription vmstate_pl011_clock = {
> -    .name = "pl011/clock",
> -    .version_id = 1,
> -    .minimum_version_id = 1,
> -    .fields = (VMStateField[]) {
> -        VMSTATE_CLOCK(clk, PL011State),
> -        VMSTATE_END_OF_LIST()
> -    }
> -};
> +static int pl011_post_load(void *opaque, int version_id)
> +{
> +    PL011State *s = PL011(opaque);
> +
> +    pl011_trace_baudrate_change(s);
> +
> +    return 0;
> +}
>
>  static const VMStateDescription vmstate_pl011 = {
>      .name = "pl011",
>      .version_id = 2,
>      .minimum_version_id = 2,
> +    .post_load = pl011_post_load,
>      .fields = (VMStateField[]) {
>          VMSTATE_UINT32(readbuff, PL011State),
>          VMSTATE_UINT32(flags, PL011State),
> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
>          VMSTATE_INT32(read_trigger, PL011State),
>          VMSTATE_END_OF_LIST()
>      },
> -    .subsections = (const VMStateDescription * []) {
> -        &vmstate_pl011_clock,
> -        NULL
> -    }
>  };

Doesn't dropping the subsection break migration compat ?

thanks
-- PMM


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] hw/char/pl011: Fix clock migration failure
  2021-03-17  9:09 ` Peter Maydell
@ 2021-03-17 10:37   ` Gavin Shan
  2021-03-17 10:40     ` Peter Maydell
  0 siblings, 1 reply; 10+ messages in thread
From: Gavin Shan @ 2021-03-17 10:37 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
	qemu-arm, Shan Gavin, Paolo Bonzini, Marc-André Lureau

Hi Peter,

On 3/17/21 8:09 PM, Peter Maydell wrote:
> On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
>>
>> There is a added clock to trace buad rate change since v5.2.0 by
>> commit aac63e0e6ea3 ("hw/char/pl011: add a clock input"). The added
>> clock causes migration failure. For example, migration from v5.2.0
>> to v5.1.0 can fail with the following error messages:
>>
>>     qemu-system-aarch64: error while loading state for instance 0x0 \
>>                          of device 'pl011'
>>     qemu-system-aarch64: load of migration failed: No such file or \
>>                          directory
>>
>> This fixes the issue by reporting the baud rate change at post load
>> time so that the clock won't be migrated by sub-section to avoid the
>> migration failure.
>>
>> Cc: qemu-stable@nongnu.org
>> Fixes: aac63e0e6ea3 ("hw/char/pl011: add a clock input")
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   hw/char/pl011.c | 22 +++++++++-------------
>>   1 file changed, 9 insertions(+), 13 deletions(-)
>>
>> diff --git a/hw/char/pl011.c b/hw/char/pl011.c
>> index c5621a195f..401bd28536 100644
>> --- a/hw/char/pl011.c
>> +++ b/hw/char/pl011.c
>> @@ -322,20 +322,20 @@ static const MemoryRegionOps pl011_ops = {
>>       .endianness = DEVICE_NATIVE_ENDIAN,
>>   };
>>
>> -static const VMStateDescription vmstate_pl011_clock = {
>> -    .name = "pl011/clock",
>> -    .version_id = 1,
>> -    .minimum_version_id = 1,
>> -    .fields = (VMStateField[]) {
>> -        VMSTATE_CLOCK(clk, PL011State),
>> -        VMSTATE_END_OF_LIST()
>> -    }
>> -};
>> +static int pl011_post_load(void *opaque, int version_id)
>> +{
>> +    PL011State *s = PL011(opaque);
>> +
>> +    pl011_trace_baudrate_change(s);
>> +
>> +    return 0;
>> +}
>>
>>   static const VMStateDescription vmstate_pl011 = {
>>       .name = "pl011",
>>       .version_id = 2,
>>       .minimum_version_id = 2,
>> +    .post_load = pl011_post_load,
>>       .fields = (VMStateField[]) {
>>           VMSTATE_UINT32(readbuff, PL011State),
>>           VMSTATE_UINT32(flags, PL011State),
>> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
>>           VMSTATE_INT32(read_trigger, PL011State),
>>           VMSTATE_END_OF_LIST()
>>       },
>> -    .subsections = (const VMStateDescription * []) {
>> -        &vmstate_pl011_clock,
>> -        NULL
>> -    }
>>   };
> 
> Doesn't dropping the subsection break migration compat ?
> 

It's why this patch needs to be backported to stable branches.
In that way, we won't have migration compatible issue.

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] hw/char/pl011: Fix clock migration failure
  2021-03-17 10:37   ` Gavin Shan
@ 2021-03-17 10:40     ` Peter Maydell
  2021-03-17 10:59       ` Gavin Shan
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Maydell @ 2021-03-17 10:40 UTC (permalink / raw)
  To: Gavin Shan
  Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
	qemu-arm, Shan Gavin, Paolo Bonzini, Marc-André Lureau

On Wed, 17 Mar 2021 at 10:37, Gavin Shan <gshan@redhat.com> wrote:
>
> Hi Peter,
>
> On 3/17/21 8:09 PM, Peter Maydell wrote:
> > On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
> >>
> >>   static const VMStateDescription vmstate_pl011 = {
> >>       .name = "pl011",
> >>       .version_id = 2,
> >>       .minimum_version_id = 2,
> >> +    .post_load = pl011_post_load,
> >>       .fields = (VMStateField[]) {
> >>           VMSTATE_UINT32(readbuff, PL011State),
> >>           VMSTATE_UINT32(flags, PL011State),
> >> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
> >>           VMSTATE_INT32(read_trigger, PL011State),
> >>           VMSTATE_END_OF_LIST()
> >>       },
> >> -    .subsections = (const VMStateDescription * []) {
> >> -        &vmstate_pl011_clock,
> >> -        NULL
> >> -    }
> >>   };
> >
> > Doesn't dropping the subsection break migration compat ?
> >
>
> It's why this patch needs to be backported to stable branches.
> In that way, we won't have migration compatible issue.

No, migration has to work from the existing already
shipped 5.1, 5.2, etc releases to 6.0 (assuming you use
the correct "virt-5.2" &c versioned machine type.)

thanks
-- PMM


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] hw/char/pl011: Fix clock migration failure
  2021-03-17 10:40     ` Peter Maydell
@ 2021-03-17 10:59       ` Gavin Shan
  2021-03-17 11:14         ` Peter Maydell
  0 siblings, 1 reply; 10+ messages in thread
From: Gavin Shan @ 2021-03-17 10:59 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
	qemu-arm, Shan Gavin, Paolo Bonzini, Marc-André Lureau

Hi Peter,

On 3/17/21 9:40 PM, Peter Maydell wrote:
> On Wed, 17 Mar 2021 at 10:37, Gavin Shan <gshan@redhat.com> wrote:
>> On 3/17/21 8:09 PM, Peter Maydell wrote:
>>> On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
>>>>
>>>>    static const VMStateDescription vmstate_pl011 = {
>>>>        .name = "pl011",
>>>>        .version_id = 2,
>>>>        .minimum_version_id = 2,
>>>> +    .post_load = pl011_post_load,
>>>>        .fields = (VMStateField[]) {
>>>>            VMSTATE_UINT32(readbuff, PL011State),
>>>>            VMSTATE_UINT32(flags, PL011State),
>>>> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
>>>>            VMSTATE_INT32(read_trigger, PL011State),
>>>>            VMSTATE_END_OF_LIST()
>>>>        },
>>>> -    .subsections = (const VMStateDescription * []) {
>>>> -        &vmstate_pl011_clock,
>>>> -        NULL
>>>> -    }
>>>>    };
>>>
>>> Doesn't dropping the subsection break migration compat ?
>>>
>>
>> It's why this patch needs to be backported to stable branches.
>> In that way, we won't have migration compatible issue.
> 
> No, migration has to work from the existing already
> shipped 5.1, 5.2, etc releases to 6.0 (assuming you use
> the correct "virt-5.2" &c versioned machine type.)
> 

Commit aac63e0e6ea3 ("hw/char/pl011: add a clock input") is merged
to v5.2.0. The migration failure happens during migration from v6.0
to v5.1 with machine type as "virt-5.1", instead of migrating from
v5.1 to v6.0. One question is if we need support backwards migration?

If we do support backwards migration, I think there are two options:
(1) merge this patch and backport it to v5.2+; (2) Backport commit
aac63e0e6ea3 to v5.2-. I guess (1) would be right way to go because
it's less effort than (2). Besides, the clock needn't to be migrated
if I'm correct.

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] hw/char/pl011: Fix clock migration failure
  2021-03-17 10:59       ` Gavin Shan
@ 2021-03-17 11:14         ` Peter Maydell
  2021-03-17 12:54           ` Andrew Jones
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Maydell @ 2021-03-17 11:14 UTC (permalink / raw)
  To: Gavin Shan
  Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
	qemu-arm, Shan Gavin, Paolo Bonzini, Marc-André Lureau

On Wed, 17 Mar 2021 at 10:59, Gavin Shan <gshan@redhat.com> wrote:
>
> Hi Peter,
>
> On 3/17/21 9:40 PM, Peter Maydell wrote:
> > On Wed, 17 Mar 2021 at 10:37, Gavin Shan <gshan@redhat.com> wrote:
> >> On 3/17/21 8:09 PM, Peter Maydell wrote:
> >>> On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
> >>>>
> >>>>    static const VMStateDescription vmstate_pl011 = {
> >>>>        .name = "pl011",
> >>>>        .version_id = 2,
> >>>>        .minimum_version_id = 2,
> >>>> +    .post_load = pl011_post_load,
> >>>>        .fields = (VMStateField[]) {
> >>>>            VMSTATE_UINT32(readbuff, PL011State),
> >>>>            VMSTATE_UINT32(flags, PL011State),
> >>>> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
> >>>>            VMSTATE_INT32(read_trigger, PL011State),
> >>>>            VMSTATE_END_OF_LIST()
> >>>>        },
> >>>> -    .subsections = (const VMStateDescription * []) {
> >>>> -        &vmstate_pl011_clock,
> >>>> -        NULL
> >>>> -    }
> >>>>    };
> >>>
> >>> Doesn't dropping the subsection break migration compat ?
> >>>
> >>
> >> It's why this patch needs to be backported to stable branches.
> >> In that way, we won't have migration compatible issue.
> >
> > No, migration has to work from the existing already
> > shipped 5.1, 5.2, etc releases to 6.0 (assuming you use
> > the correct "virt-5.2" &c versioned machine type.)
> >
>
> Commit aac63e0e6ea3 ("hw/char/pl011: add a clock input") is merged
> to v5.2.0. The migration failure happens during migration from v6.0
> to v5.1 with machine type as "virt-5.1", instead of migrating from
> v5.1 to v6.0. One question is if we need support backwards migration?

Upstream doesn't care about backwards migration. AIUI
RedHat as a downstream care about the backwards-migration
case in some specific situations, but I don't know if that
would include this one.

I misread the commit message of this patch and hadn't
realised that it was talking about a 5.2 -> 5.1 migration.
From upstream's point of view, commit aac63e0e6ea3 is fine
because it preserves forwards migration (5.1 -> 5.2).
If there's a change that makes RH-as-a-downstream's life
easier without breaking upstream's forward-compat requirements,
we can look at it: but unless I'm misreading this patch,
it would break 5.2 -> 6.0 migration, because 5.2 sends the
subsection and 6.0 with this patch would say "I don't know
how to deal with this subsection I've been sent".

> If we do support backwards migration, I think there are two options:
> (1) merge this patch and backport it to v5.2+; (2) Backport commit
> aac63e0e6ea3 to v5.2-. I guess (1) would be right way to go because
> it's less effort than (2). Besides, the clock needn't to be migrated
> if I'm correct.

You can't fix a backwards migration issue (if you care about it)
by backporting any patch to anywhere -- you need to deal with what
has already shipped.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] hw/char/pl011: Fix clock migration failure
  2021-03-17 11:14         ` Peter Maydell
@ 2021-03-17 12:54           ` Andrew Jones
  2021-03-17 13:09             ` Philippe Mathieu-Daudé
                               ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Andrew Jones @ 2021-03-17 12:54 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-arm, Shan Gavin, Paolo Bonzini,
	Gavin Shan

On Wed, Mar 17, 2021 at 11:14:56AM +0000, Peter Maydell wrote:
> On Wed, 17 Mar 2021 at 10:59, Gavin Shan <gshan@redhat.com> wrote:
> >
> > Hi Peter,
> >
> > On 3/17/21 9:40 PM, Peter Maydell wrote:
> > > On Wed, 17 Mar 2021 at 10:37, Gavin Shan <gshan@redhat.com> wrote:
> > >> On 3/17/21 8:09 PM, Peter Maydell wrote:
> > >>> On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
> > >>>>
> > >>>>    static const VMStateDescription vmstate_pl011 = {
> > >>>>        .name = "pl011",
> > >>>>        .version_id = 2,
> > >>>>        .minimum_version_id = 2,
> > >>>> +    .post_load = pl011_post_load,
> > >>>>        .fields = (VMStateField[]) {
> > >>>>            VMSTATE_UINT32(readbuff, PL011State),
> > >>>>            VMSTATE_UINT32(flags, PL011State),
> > >>>> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
> > >>>>            VMSTATE_INT32(read_trigger, PL011State),
> > >>>>            VMSTATE_END_OF_LIST()
> > >>>>        },
> > >>>> -    .subsections = (const VMStateDescription * []) {
> > >>>> -        &vmstate_pl011_clock,
> > >>>> -        NULL
> > >>>> -    }
> > >>>>    };
> > >>>
> > >>> Doesn't dropping the subsection break migration compat ?
> > >>>
> > >>
> > >> It's why this patch needs to be backported to stable branches.
> > >> In that way, we won't have migration compatible issue.
> > >
> > > No, migration has to work from the existing already
> > > shipped 5.1, 5.2, etc releases to 6.0 (assuming you use
> > > the correct "virt-5.2" &c versioned machine type.)
> > >
> >
> > Commit aac63e0e6ea3 ("hw/char/pl011: add a clock input") is merged
> > to v5.2.0. The migration failure happens during migration from v6.0
> > to v5.1 with machine type as "virt-5.1", instead of migrating from
> > v5.1 to v6.0. One question is if we need support backwards migration?
> 
> Upstream doesn't care about backwards migration. AIUI
> RedHat as a downstream care about the backwards-migration
> case in some specific situations, but I don't know if that
> would include this one.

Right, we do prefer to be able to support "ping-pong" migrations. For
example, if we start a virt-5.1 machine on a 5.1 build of QEMU, and then
migrate it to a 5.2 build of QEMU, we'd like to also be able to go back
to the 5.1 build.

I agree this patch is not the right approach. I think the right approach
is to introduce a compat property and make the "new" section dependent
on it. And then update the hw_compat_* arrays. Gavin, please take a look
at "Connecting subsections to properties" of docs/devel/migration.rst.

I'm also curious what the state of mach-virt's machine types are for
migration. It'd be nice to exhaustively test both forward migration of
all machine types and ping-pong migrations of all machine types. We can
then consider each issue we find (the pessimist in me suggests we'll find
more than this pl011 issue) and how/if we want to resolve them.

Thanks,
drew



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] hw/char/pl011: Fix clock migration failure
  2021-03-17 12:54           ` Andrew Jones
@ 2021-03-17 13:09             ` Philippe Mathieu-Daudé
  2021-03-17 13:22             ` Peter Maydell
  2021-03-18  2:34             ` Gavin Shan
  2 siblings, 0 replies; 10+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-03-17 13:09 UTC (permalink / raw)
  To: Andrew Jones, Peter Maydell
  Cc: Gavin Shan, Luc Michel, Beraldo Leal, QEMU Developers, qemu-arm,
	avocado-devel, Shan Gavin, Paolo Bonzini, Marc-André Lureau

+Beraldo

On 3/17/21 1:54 PM, Andrew Jones wrote:
> On Wed, Mar 17, 2021 at 11:14:56AM +0000, Peter Maydell wrote:
>> On Wed, 17 Mar 2021 at 10:59, Gavin Shan <gshan@redhat.com> wrote:
>>>
>>> Hi Peter,
>>>
>>> On 3/17/21 9:40 PM, Peter Maydell wrote:
>>>> On Wed, 17 Mar 2021 at 10:37, Gavin Shan <gshan@redhat.com> wrote:
>>>>> On 3/17/21 8:09 PM, Peter Maydell wrote:
>>>>>> On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
>>>>>>>
>>>>>>>    static const VMStateDescription vmstate_pl011 = {
>>>>>>>        .name = "pl011",
>>>>>>>        .version_id = 2,
>>>>>>>        .minimum_version_id = 2,
>>>>>>> +    .post_load = pl011_post_load,
>>>>>>>        .fields = (VMStateField[]) {
>>>>>>>            VMSTATE_UINT32(readbuff, PL011State),
>>>>>>>            VMSTATE_UINT32(flags, PL011State),
>>>>>>> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
>>>>>>>            VMSTATE_INT32(read_trigger, PL011State),
>>>>>>>            VMSTATE_END_OF_LIST()
>>>>>>>        },
>>>>>>> -    .subsections = (const VMStateDescription * []) {
>>>>>>> -        &vmstate_pl011_clock,
>>>>>>> -        NULL
>>>>>>> -    }
>>>>>>>    };
>>>>>>
>>>>>> Doesn't dropping the subsection break migration compat ?
>>>>>>
>>>>>
>>>>> It's why this patch needs to be backported to stable branches.
>>>>> In that way, we won't have migration compatible issue.
>>>>
>>>> No, migration has to work from the existing already
>>>> shipped 5.1, 5.2, etc releases to 6.0 (assuming you use
>>>> the correct "virt-5.2" &c versioned machine type.)
>>>>
>>>
>>> Commit aac63e0e6ea3 ("hw/char/pl011: add a clock input") is merged
>>> to v5.2.0. The migration failure happens during migration from v6.0
>>> to v5.1 with machine type as "virt-5.1", instead of migrating from
>>> v5.1 to v6.0. One question is if we need support backwards migration?
>>
>> Upstream doesn't care about backwards migration. AIUI
>> RedHat as a downstream care about the backwards-migration
>> case in some specific situations, but I don't know if that
>> would include this one.
> 
> Right, we do prefer to be able to support "ping-pong" migrations. For
> example, if we start a virt-5.1 machine on a 5.1 build of QEMU, and then
> migrate it to a 5.2 build of QEMU, we'd like to also be able to go back
> to the 5.1 build.
> 
> I agree this patch is not the right approach. I think the right approach
> is to introduce a compat property and make the "new" section dependent
> on it. And then update the hw_compat_* arrays. Gavin, please take a look
> at "Connecting subsections to properties" of docs/devel/migration.rst.
> 
> I'm also curious what the state of mach-virt's machine types are for
> migration. It'd be nice to exhaustively test both forward migration of
> all machine types and ping-pong migrations of all machine types.

FYI this test has been suggested to the Avocado team few times.
They might already have a ticket to track any progress.

> We can
> then consider each issue we find (the pessimist in me suggests we'll find
> more than this pl011 issue) and how/if we want to resolve them.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] hw/char/pl011: Fix clock migration failure
  2021-03-17 12:54           ` Andrew Jones
  2021-03-17 13:09             ` Philippe Mathieu-Daudé
@ 2021-03-17 13:22             ` Peter Maydell
  2021-03-18  2:34             ` Gavin Shan
  2 siblings, 0 replies; 10+ messages in thread
From: Peter Maydell @ 2021-03-17 13:22 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
	Marc-André Lureau, qemu-arm, Shan Gavin, Paolo Bonzini,
	Gavin Shan

On Wed, 17 Mar 2021 at 12:55, Andrew Jones <drjones@redhat.com> wrote:
> I'm also curious what the state of mach-virt's machine types are for
> migration.

Probably not great -- I don't think anybody is really testing
cross-version migration, and I don't think there's a great
deal of in-practice use of it for Arm either. (See also the
issue with accidentally having env->features in the CPU
migration data, which broke cross-version migration: that
was around a while and we only got one user complaint about it.)

Unless we have a serious test suite for this kind of thing
upstream it's just going to continue to be broken, because at
the moment all we have is "people make best-efforts to think
about migration compat when coding and reviewing, but don't
actually test".

thanks
-- PMM


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] hw/char/pl011: Fix clock migration failure
  2021-03-17 12:54           ` Andrew Jones
  2021-03-17 13:09             ` Philippe Mathieu-Daudé
  2021-03-17 13:22             ` Peter Maydell
@ 2021-03-18  2:34             ` Gavin Shan
  2 siblings, 0 replies; 10+ messages in thread
From: Gavin Shan @ 2021-03-18  2:34 UTC (permalink / raw)
  To: Andrew Jones, Peter Maydell
  Cc: Luc Michel, QEMU Developers, Philippe Mathieu-Daudé,
	qemu-arm, Shan Gavin, Marc-André Lureau, Paolo Bonzini

Hi Drew,

On 3/17/21 11:54 PM, Andrew Jones wrote:
> On Wed, Mar 17, 2021 at 11:14:56AM +0000, Peter Maydell wrote:
>> On Wed, 17 Mar 2021 at 10:59, Gavin Shan <gshan@redhat.com> wrote:
>>> On 3/17/21 9:40 PM, Peter Maydell wrote:
>>>> On Wed, 17 Mar 2021 at 10:37, Gavin Shan <gshan@redhat.com> wrote:
>>>>> On 3/17/21 8:09 PM, Peter Maydell wrote:
>>>>>> On Wed, 17 Mar 2021 at 04:44, Gavin Shan <gshan@redhat.com> wrote:
>>>>>>>
>>>>>>>     static const VMStateDescription vmstate_pl011 = {
>>>>>>>         .name = "pl011",
>>>>>>>         .version_id = 2,
>>>>>>>         .minimum_version_id = 2,
>>>>>>> +    .post_load = pl011_post_load,
>>>>>>>         .fields = (VMStateField[]) {
>>>>>>>             VMSTATE_UINT32(readbuff, PL011State),
>>>>>>>             VMSTATE_UINT32(flags, PL011State),
>>>>>>> @@ -355,10 +355,6 @@ static const VMStateDescription vmstate_pl011 = {
>>>>>>>             VMSTATE_INT32(read_trigger, PL011State),
>>>>>>>             VMSTATE_END_OF_LIST()
>>>>>>>         },
>>>>>>> -    .subsections = (const VMStateDescription * []) {
>>>>>>> -        &vmstate_pl011_clock,
>>>>>>> -        NULL
>>>>>>> -    }
>>>>>>>     };
>>>>>>
>>>>>> Doesn't dropping the subsection break migration compat ?
>>>>>>
>>>>>
>>>>> It's why this patch needs to be backported to stable branches.
>>>>> In that way, we won't have migration compatible issue.
>>>>
>>>> No, migration has to work from the existing already
>>>> shipped 5.1, 5.2, etc releases to 6.0 (assuming you use
>>>> the correct "virt-5.2" &c versioned machine type.)
>>>>
>>>
>>> Commit aac63e0e6ea3 ("hw/char/pl011: add a clock input") is merged
>>> to v5.2.0. The migration failure happens during migration from v6.0
>>> to v5.1 with machine type as "virt-5.1", instead of migrating from
>>> v5.1 to v6.0. One question is if we need support backwards migration?
>>
>> Upstream doesn't care about backwards migration. AIUI
>> RedHat as a downstream care about the backwards-migration
>> case in some specific situations, but I don't know if that
>> would include this one.
> 
> Right, we do prefer to be able to support "ping-pong" migrations. For
> example, if we start a virt-5.1 machine on a 5.1 build of QEMU, and then
> migrate it to a 5.2 build of QEMU, we'd like to also be able to go back
> to the 5.1 build.
> 
> I agree this patch is not the right approach. I think the right approach
> is to introduce a compat property and make the "new" section dependent
> on it. And then update the hw_compat_* arrays. Gavin, please take a look
> at "Connecting subsections to properties" of docs/devel/migration.rst.
> 

Agree and thanks for the pointer. I will post another patch to have
something in hw_compat_5_1 to address this particular issue.

> I'm also curious what the state of mach-virt's machine types are for
> migration. It'd be nice to exhaustively test both forward migration of
> all machine types and ping-pong migrations of all machine types. We can
> then consider each issue we find (the pessimist in me suggests we'll find
> more than this pl011 issue) and how/if we want to resolve them.
> 

Yeah, I will think about it and do the testing to see if there are more
issues. Also, it'd better to be integrated to existing testing framework
as you suggested.

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-03-18  2:35 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-03-17  4:44 [PATCH] hw/char/pl011: Fix clock migration failure Gavin Shan
2021-03-17  9:09 ` Peter Maydell
2021-03-17 10:37   ` Gavin Shan
2021-03-17 10:40     ` Peter Maydell
2021-03-17 10:59       ` Gavin Shan
2021-03-17 11:14         ` Peter Maydell
2021-03-17 12:54           ` Andrew Jones
2021-03-17 13:09             ` Philippe Mathieu-Daudé
2021-03-17 13:22             ` Peter Maydell
2021-03-18  2:34             ` Gavin Shan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).