From mboxrd@z Thu Jan 1 00:00:00 1970
Received: by 2002:a17:906:cf90:b0:9bd:85f7:2662 with SMTP id um16csp1065741ejb;
 Fri, 20 Oct 2023 01:34:30 -0700 (PDT)
X-Google-Smtp-Source: AGHT+IFlVP1XMdvDZCanF1gFVgcNMBS4+jvZQS6RjlWv7xaHK7i/xboVxDIkw5bsyDWUkBqBrwj9
X-Received: by 2002:a67:ae09:0:b0:44e:99a2:a42 with SMTP id x9-20020a67ae09000000b0044e99a20a42mr1130577vse.11.1697790870003;
 Fri, 20 Oct 2023 01:34:30 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1697790869; cv=none; d=google.com; s=arc-20160816;
 b=PJpUe683NmQ4bkiAq0bR9c2bYcZ/+dN1TJlhz9cDtneb52uTVh43ECTqBHVdMK+HIK
 UwIpadPH1E4UN3nmQw+07NYKPxNHkhqGFJBCAI4cxVCwXuMylyosluRRq0sAqioK8zfl
 nrYMuuIWB69AG9mJ2u6TujkwjSJrPb4OYFls5Z+b0xEmQNJP/NLDjQ1qZvBCOw8X0Kbb
 R1fYf1tIQSVRN5XF1dmJzk8/MmtsSnc1hm7HUVFYieg25bqHDr/3bD3GNXCkGkmViQYG
 U0kE01tBTSwgCN82rU9Yz69MsrglYG2iD6v7/myNKlpjL+pGsM2fIGK+GN8ssjYjd2hD
 5FRA==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816;
 h=sender:errors-to:list-subscribe:list-help:list-post:list-archive
 :list-unsubscribe:list-id:precedence:content-transfer-encoding
 :mime-version:references:in-reply-to:message-id:subject:cc:to:from
 :date;
 bh=gx3+j+3wQ1sNL240zG6R0O3hzI3ML9wx0AKl7NfgxDM=;
 fh=M3IvKgODSulzaXEeVIBhJ0uBPVdt+uw3g7cDCuSSDhw=;
 b=NsVNrNJDncbi0r86YO1BtZHcQu5YaTCtLK+IqApTQGJrrkhAk5lQ5ihobPDF+CXUgc
 MeRxPO6voDpBM9yRMibcqHuRrn+MHxP+f29SIyBbtVcV6xziWap6GWknhUG1VaIuWhkL
 s8uV2aRVnTm9yXieO7I4bwsxZSh/OIkb2KbpXGTHK5eTwoi/hZPa7CqxC2PsBsRCsCmR
 /0tZmTKMUwF1eBsoEQOgPrdMYpJma3zzbsOpmMuemHjMoJdSvkZM98MHTne7ZbqfxBj2
 TeMOtxV/PtMYbUqzbg63qjBX9ST1iNi0eryUXsqug85TH0xDR4UQ2bHJOciTyYJeHuxh
 nFEQ==
ARC-Authentication-Results: i=1; mx.google.com;
 spf=pass (google.com: domain of qemu-arm-bounces+alex.bennee=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender)
 smtp.mailfrom="qemu-arm-bounces+alex.bennee=linaro.org@nongnu.org"
Return-Path:
Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17])
 by mx.google.com with ESMTPS id l10-20020ad4444a000000b0066d1b390dfasi1008278qvt.345.2023.10.20.01.34.29
 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256);
 Fri, 20 Oct 2023 01:34:29 -0700 (PDT)
Received-SPF: pass (google.com: domain of qemu-arm-bounces+alex.bennee=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender)
 client-ip=209.51.188.17;
Authentication-Results: mx.google.com;
 spf=pass (google.com: domain of qemu-arm-bounces+alex.bennee=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender)
 smtp.mailfrom="qemu-arm-bounces+alex.bennee=linaro.org@nongnu.org"
Received: from localhost ([::1] helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from )
 id 1qtkxG-00006L-QO; Fri, 20 Oct 2023 04:34:02 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from )
 id 1qtkxB-0008WC-HK; Fri, 20 Oct 2023 04:33:59 -0400
Received: from smtpout1.mo529.mail-out.ovh.net ([178.32.125.2])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from )
 id 1qtkx6-0002hc-W4; Fri, 20 Oct 2023 04:33:57 -0400
Received: from mxplan5.mail.ovh.net (unknown [10.109.146.249])
 by mo529.mail-out.ovh.net (Postfix) with ESMTPS id B1A93208A9;
 Fri, 20 Oct 2023 08:33:47 +0000 (UTC)
Received: from kaod.org (37.59.142.110) by DAG6EX1.mxp5.local (172.16.2.51)
 with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256)
 id 15.1.2507.34; Fri, 20 Oct 2023 10:33:45 +0200
Authentication-Results: garm.ovh; auth=pass (GARM-110S004a9907de5-3340-4494-9c59-5172bd8e49b7, C3DE39B25D4A09A1EA5644DCEB8289D547B4AEB9)
 smtp.auth=groug@kaod.org
X-OVh-ClientIp: 88.179.9.154
Date: Fri, 20 Oct 2023 10:33:44 +0200
From: Greg Kurz
To: Nicholas Piggin
CC: Juan Quintela , , "Stefan Berger" , Marcel Apfelbaum , , ,
 Gerd Hoffmann , Corey Minyard , Samuel Thibault , Richard Henderson ,
 David Hildenbrand , "Ilya Leoshkevich" , Fabiano Rosas , Eric Farman ,
 Peter Xu , Harsh Prateek Bora , John Snow , , Mark Cave-Ayland ,
 Christian Borntraeger , =?UTF-8?B?TWFyYy1BbmRyw6k=?= Lureau ,
 "Stefan Weil" , , Jason Wang , Corey Minyard , Leonardo Bras ,
 Thomas Huth , Peter Maydell , "Michael S. Tsirkin" ,
 =?UTF-8?B?Q8OpZHJpYw==?= Le Goater , David Gibson , Halil Pasic ,
 "Daniel Henrique Barboza"
Subject: Re: [PATCH 07/13] RFC migration: icp/server is a mess
Message-ID: <20231020103344.34baea63@bahia>
In-Reply-To:
References: <20231019190831.20363-1-quintela@redhat.com>
 <20231019190831.20363-8-quintela@redhat.com>
 <20231019233958.17abb488@bahia>
X-Mailer: Claws Mail 4.1.1 (GTK 3.24.38; x86_64-redhat-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Originating-IP: [37.59.142.110]
X-ClientProxiedBy: DAG1EX2.mxp5.local (172.16.2.2) To DAG6EX1.mxp5.local (172.16.2.51)
X-Ovh-Tracer-GUID: 1055eb79-0a0f-4153-a7d7-80632a3c3abc
X-Ovh-Tracer-Id: 4281515872457365879
X-VR-SPAMSTATE: OK
X-VR-SPAMSCORE: -100
Received-SPF: pass client-ip=178.32.125.2; envelope-from=groug@kaod.org;
 helo=smtpout1.mo529.mail-out.ovh.net
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_NONE=-0.0001,
 RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-arm@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-arm-bounces+alex.bennee=linaro.org@nongnu.org
Sender: qemu-arm-bounces+alex.bennee=linaro.org@nongnu.org
X-TUID: 8FjRxFZYbhjy

On Fri, 20 Oct 2023 17:49:38 +1000
"Nicholas Piggin" wrote:

> On Fri Oct 20, 2023 at 7:39 AM AEST, Greg Kurz wrote:
> > On Thu, 19 Oct 2023 21:08:25 +0200
> > Juan Quintela wrote:
> >
> > > Current code does:
> > > - register pre_2_10_vmstate_dummy_icp with "icp/server" and instance
> > >   depending on cpu number
> > > - for newer machines, it registers vmstate_icp with "icp/server" name
> > >   and instance 0
> > > - now it unregisters "icp/server" for the 1st instance.
> > >
> > > This is wrong at many levels:
> > > - we shouldn't have two VMStateDescriptions with the same name
> > > - In case this is the only solution that we can come up with, it needs to
> > >   be:
> > >   * register pre_2_10_vmstate_dummy_icp
> > >   * unregister pre_2_10_vmstate_dummy_icp
> > >   * register real vmstate_icp
> > >
> > > As the initialization of this machine is already complex enough, I
> > > need help from PPC maintainers to fix this.
> > >
> > > Volunteers?
> > >
> > > CC: Cedric Le Goater
> > > CC: Daniel Henrique Barboza
> > > CC: David Gibson
> > > CC: Greg Kurz
> > >
> > > Signed-off-by: Juan Quintela
> > > ---
> > >  hw/ppc/spapr.c | 7 ++++++-
> > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index cb840676d3..8531d13492 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -143,7 +143,12 @@ static bool pre_2_10_vmstate_dummy_icp_needed(void *opaque)
> > >  }
> > >
> > >  static const VMStateDescription pre_2_10_vmstate_dummy_icp = {
> > > -    .name = "icp/server",
> > > +    /*
> > > +     * Hack ahead. We can't have two devices with the same name and
> > > +     * instance id. So I rename this to pass make check.
> > > +     * Real help from people who knows the hardware is needed.
> > > +     */
> > > +    .name = "pre-2.10-icp/server",
> > >      .version_id = 1,
> > >      .minimum_version_id = 1,
> > >      .needed = pre_2_10_vmstate_dummy_icp_needed,
> >
> > I guess this fix is acceptable as well and a lot simpler than
> > reverting the hack actually. The outcome is the same: drop
> > compat with pseries-2.9 and older.
> >
> > Reviewed-by: Greg Kurz
>
> So the reason we can't have duplicate names registered, aside from it
> surely going bad if we actually send or receive a stream at the point
> they are registered, is the duplicate check introduced in patch 9? But
> before that, this hack does seem to actually work because the duplicate
> is unregistered right away.
>

Correct.

> If I understand the workaround, there is an asymmetry in the migration
> sequence in that receiving an unexpected object would cause a failure,
> but going from newer to older would just skip some "expected" objects
> and that didn't cause a problem. So you only have to deal with ignoring
> the unexpected ones going from older to newer.
>

Correct.

> Side question, is it possible to flag the problem of *not* receiving
> an object that you did expect? That might be a source of bugs too.
>

AFAICR we try to only migrate state that differs from reset: the
destination cannot really assume it will receive anything for a given
device.

> Anyway, I wonder if we could fix this spapr problem by adding a special
> case wild card instance matcher to ignore it? It's still a bit hacky
> but maybe a bit nicer. I don't mind deprecating the machine soon if
> you want to clear the wildcard hack away soon, but it would be nice to
> separate the deprecation and removal from the fix, if possible.
>
> This patch is not tested but hopefully helps illustrate the idea.
>

I'm not sure this will fly with older QEMUs that don't know about
VMSTATE_INSTANCE_ID_WILD... but I'll let Juan comment on that.

> Thanks,
> Nick
>

Cheers,

--
Greg

> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index 1a31fb7293..8ce03edefa 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -1205,6 +1205,7 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
>  bool vmstate_save_needed(const VMStateDescription *vmsd, void *opaque);
>
>  #define VMSTATE_INSTANCE_ID_ANY -1
> +#define VMSTATE_INSTANCE_ID_WILD -2
>
>  /* Returns: 0 on success, -1 on failure */
>  int vmstate_register_with_alias_id(VMStateIf *obj, uint32_t instance_id,
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index cb840676d3..2418899dd4 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -155,16 +155,10 @@ static const VMStateDescription pre_2_10_vmstate_dummy_icp = {
>      },
>  };
>
> -static void pre_2_10_vmstate_register_dummy_icp(int i)
> +static void pre_2_10_vmstate_register_dummy_icp(void)
>  {
> -    vmstate_register(NULL, i, &pre_2_10_vmstate_dummy_icp,
> -                     (void *)(uintptr_t) i);
> -}
> -
> -static void pre_2_10_vmstate_unregister_dummy_icp(int i)
> -{
> -    vmstate_unregister(NULL, &pre_2_10_vmstate_dummy_icp,
> -                       (void *)(uintptr_t) i);
> +    vmstate_register(NULL, VMSTATE_INSTANCE_ID_WILD,
> +                     &pre_2_10_vmstate_dummy_icp, NULL);
>  }
>
>  int spapr_max_server_number(SpaprMachineState *spapr)
> @@ -2665,12 +2659,10 @@ static void spapr_init_cpus(SpaprMachineState *spapr)
>      }
>
>      if (smc->pre_2_10_has_unused_icps) {
> -        for (i = 0; i < spapr_max_server_number(spapr); i++) {
> -            /* Dummy entries get deregistered when real ICPState objects
> -             * are registered during CPU core hotplug.
> -             */
> -            pre_2_10_vmstate_register_dummy_icp(i);
> -        }
> +        /* Dummy entries get deregistered when real ICPState objects
> +         * are registered during CPU core hotplug.
> +         */
> +        pre_2_10_vmstate_register_dummy_icp();
>      }
>
>      for (i = 0; i < possible_cpus->len; i++) {
> @@ -3873,21 +3865,9 @@ void spapr_core_release(DeviceState *dev)
>  static void spapr_core_unplug(HotplugHandler *hotplug_dev, DeviceState *dev)
>  {
>      MachineState *ms = MACHINE(hotplug_dev);
> -    SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(ms);
>      CPUCore *cc = CPU_CORE(dev);
>      CPUArchId *core_slot = spapr_find_cpu_slot(ms, cc->core_id, NULL);
>
> -    if (smc->pre_2_10_has_unused_icps) {
> -        SpaprCpuCore *sc = SPAPR_CPU_CORE(OBJECT(dev));
> -        int i;
> -
> -        for (i = 0; i < cc->nr_threads; i++) {
> -            CPUState *cs = CPU(sc->threads[i]);
> -
> -            pre_2_10_vmstate_register_dummy_icp(cs->cpu_index);
> -        }
> -    }
> -
>      assert(core_slot);
>      core_slot->cpu = NULL;
>      qdev_unrealize(dev);
> @@ -3968,10 +3948,8 @@ static void spapr_core_plug(HotplugHandler *hotplug_dev, DeviceState *dev)
>  {
>      SpaprMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
>      MachineClass *mc = MACHINE_GET_CLASS(spapr);
> -    SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
>      SpaprCpuCore *core = SPAPR_CPU_CORE(OBJECT(dev));
>      CPUCore *cc = CPU_CORE(dev);
> -    CPUState *cs;
>      SpaprDrc *drc;
>      CPUArchId *core_slot;
>      int index;
> @@ -4018,13 +3996,6 @@ static void spapr_core_plug(HotplugHandler *hotplug_dev, DeviceState *dev)
>                                             &error_abort);
>          }
>      }
> -
> -    if (smc->pre_2_10_has_unused_icps) {
> -        for (i = 0; i < cc->nr_threads; i++) {
> -            cs = CPU(core->threads[i]);
> -            pre_2_10_vmstate_unregister_dummy_icp(cs->cpu_index);
> -        }
> -    }
>  }
>
>  static void spapr_core_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 497ce02bd7..f33449e208 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -989,6 +989,10 @@ static int vmstate_save(QEMUFile *f, SaveStateEntry *se, JSONWriter *vmdesc)
>          trace_savevm_section_skip(se->idstr, se->section_id);
>          return 0;
>      }
> +    if (se->instance_id == VMSTATE_INSTANCE_ID_WILD) {
> +        warn_report("Wildcard vmstate entry must set needed=false");
> +        return 0;
> +    }
>
>      trace_savevm_section_start(se->idstr, se->section_id);
>      save_section_header(f, se, QEMU_VM_SECTION_FULL);
> @@ -1731,13 +1735,16 @@ int qemu_save_device_state(QEMUFile *f)
>
>  static SaveStateEntry *find_se(const char *idstr, uint32_t instance_id)
>  {
> +    SaveStateEntry *se_wild = NULL;
>      SaveStateEntry *se;
>
>      QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> -        if (!strcmp(se->idstr, idstr) &&
> -            (instance_id == se->instance_id ||
> -             instance_id == se->alias_id))
> -            return se;
> +        if (!strcmp(se->idstr, idstr)) {
> +            if (instance_id == se->instance_id || instance_id == se->alias_id)
> +                return se;
> +            if (se->instance_id == VMSTATE_INSTANCE_ID_WILD)
> +                se_wild = se;
> +        }
>          /* Migrating from an older version? */
>          if (strstr(se->idstr, idstr) && se->compat) {
>              if (!strcmp(se->compat->idstr, idstr) &&
> @@ -1746,7 +1753,7 @@ static SaveStateEntry *find_se(const char *idstr, uint32_t instance_id)
>                  return se;
>          }
>      }
> -    return NULL;
> +    return se_wild;
>  }
>
>  enum LoadVMExitCodes {

--
Greg
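
Editorial illustration, not part of the thread: the lookup order that the
quoted find_se() change aims for (an exact instance or alias match wins,
and a same-named wildcard entry is only a fallback) can be sketched as a
standalone C fragment. This is a minimal sketch under stated assumptions.
Entry, find_entry(), INSTANCE_ID_WILD and NO_ALIAS are hypothetical
stand-ins invented for this example, not QEMU definitions, and the compat
(older version) branch of the real function is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants for this sketch only (not QEMU's definitions). */
#define INSTANCE_ID_WILD ((uint32_t)-2) /* plays the role of the proposed wildcard id */
#define NO_ALIAS         ((uint32_t)-1) /* marks "this entry has no alias id" */

/* Hypothetical, simplified stand-in for a registered save-state entry. */
typedef struct Entry {
    const char *idstr;    /* section name, e.g. "icp/server" */
    uint32_t instance_id; /* concrete id, or INSTANCE_ID_WILD */
    uint32_t alias_id;    /* alternative id, or NO_ALIAS */
} Entry;

/*
 * Look up an entry by name and instance id.
 * An exact instance/alias match is returned as soon as it is seen.
 * A wildcard entry with the same name is only remembered and returned
 * if the scan finds no exact match. Returns NULL if nothing matches.
 */
static const Entry *find_entry(const Entry *tab, size_t n,
                               const char *idstr, uint32_t instance_id)
{
    const Entry *wild = NULL;

    assert(tab != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const Entry *e = &tab[i];

        if (strcmp(e->idstr, idstr) != 0) {
            continue; /* different section name */
        }
        if (instance_id == e->instance_id || instance_id == e->alias_id) {
            return e; /* exact match always wins */
        }
        if (e->instance_id == INSTANCE_ID_WILD) {
            wild = e; /* candidate fallback, keep scanning */
        }
    }
    return wild;
}
```

With a table holding a real "icp/server" entry for instance 0 plus one
wildcard "icp/server" entry, a lookup for instance 0 returns the real
entry, while a lookup for any other instance falls back to the wildcard.
That fallback is what would let an incoming stream's unused ICP sections
be accepted and ignored on the destination.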