From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:43547)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jjherne@linux.vnet.ibm.com>) id 1Um5Hi-0004FG-Jx
	for qemu-devel@nongnu.org; Mon, 10 Jun 2013 12:49:37 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jjherne@linux.vnet.ibm.com>) id 1Um5Hg-0006Mo-0u
	for qemu-devel@nongnu.org; Mon, 10 Jun 2013 12:49:34 -0400
Received: from e7.ny.us.ibm.com ([32.97.182.137]:52261)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jjherne@linux.vnet.ibm.com>) id 1Um5Hf-0006Mi-Rg
	for qemu-devel@nongnu.org; Mon, 10 Jun 2013 12:49:31 -0400
Received: from /spool/local
	by e7.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only!
	Violators will be prosecuted
	for <qemu-devel@nongnu.org> from <jjherne@linux.vnet.ibm.com>;
	Mon, 10 Jun 2013 12:49:31 -0400
Received: from d01relay02.pok.ibm.com (d01relay02.pok.ibm.com [9.56.227.234])
	by d01dlp03.pok.ibm.com (Postfix) with ESMTP id 641C3C9005E
	for <qemu-devel@nongnu.org>; Mon, 10 Jun 2013 12:49:25 -0400 (EDT)
Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217])
	by d01relay02.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id
	r5AGnPfk224066
	for <qemu-devel@nongnu.org>; Mon, 10 Jun 2013 12:49:25 -0400
Received: from d01av03.pok.ibm.com (loopback [127.0.0.1])
	by d01av03.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id
	r5AGnNHb031266
	for <qemu-devel@nongnu.org>; Mon, 10 Jun 2013 13:49:24 -0300
Message-ID: <51B60392.8080803@linux.vnet.ibm.com>
Date: Mon, 10 Jun 2013 12:49:22 -0400
From: "Jason J. Herne" <jjherne@linux.vnet.ibm.com>
MIME-Version: 1.0
References: <1370626087-840-1-git-send-email-jjherne@us.ibm.com>
	<1370626087-840-9-git-send-email-jjherne@us.ibm.com>
	<51B3D647.6090400@suse.de>
In-Reply-To: <51B3D647.6090400@suse.de>
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] [PATCH 8/8] [PATCH RFC v2] s390-qemu: cpu hotplug
 - Treat S390 cpus as devices
Reply-To: jjherne@linux.vnet.ibm.com
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: =?ISO-8859-15?Q?Andreas_F=E4rber?= <afaerber@suse.de>
Cc: ehabkost@redhat.com, qemu-devel@nongnu.org, agraf@suse.de, borntraeger@de.ibm.com, jfrei@linux.vnet.ibm.com, Cornelia Huck <cornelia.huck@de.ibm.com>, imammedo@redhat.com, "Jason J. Herne" <jjherne@us.ibm.com>

On 06/08/2013 09:11 PM, Andreas Färber wrote:
> Am 07.06.2013 19:28, schrieb Jason J. Herne:
>> From: "Jason J. Herne" <jjherne@us.ibm.com>
>>
>> Modify cpu initialization and QOM routines associated with s390-cpu such that
>> all cpus on S390 are now created via the QOM device creation code path.
>>
>> Signed-off-by: Jason J. Herne <jjherne@us.ibm.com>
>> ---
>>   hw/s390x/s390-virtio-ccw.c |   15 ++++++++++-----
>>   hw/s390x/s390-virtio.c     |   25 +++++--------------------
>>   hw/s390x/s390-virtio.h     |    2 +-
>>   include/qapi/qmp/qerror.h  |    3 +++
>>   qdev-monitor.c             |   17 +++++++++++++++++
>>   target-s390x/cpu.c         |   24 ++++++++++++++++++++++--
>>   6 files changed, 58 insertions(+), 28 deletions(-)
>>
>> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
>> index 70bd858..141adce 100644
>> --- a/hw/s390x/s390-virtio-ccw.c
>> +++ b/hw/s390x/s390-virtio-ccw.c
>> @@ -95,12 +95,8 @@ static void ccw_init(QEMUMachineInitArgs *args)
>>       /* allocate storage keys */
>>       s390_set_storage_keys_p(g_malloc0(my_ram_size / TARGET_PAGE_SIZE));
>>
>> -    /* init CPUs */
>> -    s390_init_cpus(args->cpu_model);
>> +    s390_init_ipi_states();
>>
>> -    if (kvm_enabled()) {
>> -        kvm_s390_enable_css_support(s390_cpu_addr2state(0));
>> -    }
>>       /*
>>        * Create virtual css and set it as default so that non mcss-e
>>        * enabled guests only see virtio devices.
>> @@ -112,11 +108,20 @@ static void ccw_init(QEMUMachineInitArgs *args)
>>       s390_create_virtio_net(BUS(css_bus), "virtio-net-ccw");
>>   }
>>
>> +static void ccw_post_cpu_init(void)
>> +{
>> +    if (kvm_enabled()) {
>> +        kvm_s390_enable_css_support(s390_cpu_addr2state(0));
>> +    }
>> +}
>
> Am I understanding correctly that all this is about differentiating one
> call between the ccw and legacy machines?
>

If you are referring to the post_cpu_init hook that I've added in a 
previous patch, then yes :). I get into more detail in a reply to the 
relevant patch but essentially kvm_s390_enable_css_support() currently 
depends on cpus being initialized.  My patch series moves cpu 
initialization out of machine->init() and places it in main().  This is 
done to allow cpus to be treated more like QOM devices which are all 
created in main().

> Isn't there a machine-init-done Notifier that the ccw machine init could
> register for?
>

If there is I am not aware of it.  I will investigate.

> What if CPU 0 were hot-unplugged? Would the capability need to be
> re-enabled or will this remain a one-time task?
>

Several places in S390 code refer to cpu 0 specifically.  When unplug is 
implemented we will need to check for and fail an attempt to unplug cpu 
0.  Else: We'll need to remove all dependencies on cpu-0.

>> +
>>   static QEMUMachine ccw_machine = {
>>       .name = "s390-ccw-virtio",
>>       .alias = "s390-ccw",
>>       .desc = "VirtIO-ccw based S390 machine",
>> +    .cpu_device_str = "s390-cpu",
>
> TYPE_S390_CPU would be safer than hardcoding, if we need to do this.
>
>>       .init = ccw_init,
>> +    .post_cpu_init = ccw_post_cpu_init,
>>       .block_default_type = IF_VIRTIO,
>>       .no_cdrom = 1,
>>       .no_floppy = 1,
>> diff --git a/hw/s390x/s390-virtio.c b/hw/s390x/s390-virtio.c
>> index 4af2d86..069a187 100644
>> --- a/hw/s390x/s390-virtio.c
>> +++ b/hw/s390x/s390-virtio.c
>> @@ -201,31 +201,17 @@ void s390_init_ipl_dev(const char *kernel_filename,
>>       qdev_init_nofail(dev);
>>   }
>>
>> -void s390_init_cpus(const char *cpu_model)
>> +void s390_init_ipi_states(void)
>>   {
>>       int i;
>>
>> -    if (cpu_model == NULL) {
>> -        cpu_model = "host";
>> -    }
>> -
>> -    ipi_states = g_malloc(sizeof(S390CPU *) * smp_cpus);
>> -
>> -    for (i = 0; i < smp_cpus; i++) {
>> -        S390CPU *cpu;
>> -        CPUState *cs;
>> +    ipi_states = g_malloc(sizeof(S390CPU *) * max_cpus);
>>
>> -        cpu = cpu_s390x_init(cpu_model);
>> -        cs = CPU(cpu);
>> -
>> -        ipi_states[i] = cpu;
>> -        cs->halted = 1;
>> -        cpu->env.exception_index = EXCP_HLT;
>> -        cpu->env.storage_keys = s390_get_storage_keys_p();
>> +    for (i = 0; i < max_cpus; i++) {
>> +        ipi_states[i] = NULL;
>>       }
>>   }
>>
>> -
>
> Whitespace change intentional?
>
>>   void s390_create_virtio_net(BusState *bus, const char *name)
>>   {
>>       int i;
>> @@ -296,8 +282,7 @@ static void s390_init(QEMUMachineInitArgs *args)
>>       /* allocate storage keys */
>>       s390_set_storage_keys_p(g_malloc0(my_ram_size / TARGET_PAGE_SIZE));
>>
>> -    /* init CPUs */
>> -    s390_init_cpus(args->cpu_model);
>> +    s390_init_ipi_states();
>>
>>       /* Create VirtIO network adapters */
>>       s390_create_virtio_net((BusState *)s390_bus, "virtio-net-s390");
>
> So effectively you're ripping out support for -cpu arguments and
> assuming that s390-cpu will stay the only available type - when we were
> actually just waiting for you guys to sort out how you want your models
> to be named, which I believe you wanted to coordinate with Linux?

Removing model support was not my intention.  I'll discuss this with my 
team and ensure we're all on the same page and headed down the right 
path with respect to models.

>
> I still don't understand why you want to deviate from all other
> architectures here. -smp N is supposed to create N times -cpu, not N
> times QEMUMachine::cpu_device_str.
>

It is not my intention to deviate from other architectures. 
cpu_device_str is used so we can detect when the user has given us a cpu 
device on the command line.  We convert -smp to -device statements using 
cpu_device_str so we can handle cpus a single way instead of two ways. 
Plus it greatly simplifies counting.  I'll investigate this situation 
further.


>> diff --git a/hw/s390x/s390-virtio.h b/hw/s390x/s390-virtio.h
>> index c1cb042..7b1ef9f 100644
>> --- a/hw/s390x/s390-virtio.h
>> +++ b/hw/s390x/s390-virtio.h
>> @@ -20,7 +20,7 @@
>>   typedef int (*s390_virtio_fn)(const uint64_t *args);
>>   void s390_register_virtio_hypercall(uint64_t code, s390_virtio_fn fn);
>>
>> -void s390_init_cpus(const char *cpu_model);
>> +void s390_init_ipi_states(void);
>>   void s390_init_ipl_dev(const char *kernel_filename,
>>                          const char *kernel_cmdline,
>>                          const char *initrd_filename,
>> diff --git a/include/qapi/qmp/qerror.h b/include/qapi/qmp/qerror.h
>> index 6c0a18d..6627dc4 100644
>> --- a/include/qapi/qmp/qerror.h
>> +++ b/include/qapi/qmp/qerror.h
>> @@ -162,6 +162,9 @@ void assert_no_error(Error *err);
>>   #define QERR_KVM_MISSING_CAP \
>>       ERROR_CLASS_K_V_M_MISSING_CAP, "Using KVM without %s, %s unavailable"
>>
>> +#define QERR_MAX_CPUS \
>> +    ERROR_CLASS_GENERIC_ERROR, "The maximum number of cpus has already been created for this guest"
>
> Don't add QERR_* constants, use error_setg().
>

Will change.

>> +
>>   #define QERR_MIGRATION_ACTIVE \
>>       ERROR_CLASS_GENERIC_ERROR, "There's a migration process in progress"
>>
>> diff --git a/qdev-monitor.c b/qdev-monitor.c
>> index e54dbc2..a4adeb8 100644
>> --- a/qdev-monitor.c
>> +++ b/qdev-monitor.c
>> @@ -23,6 +23,9 @@
>>   #include "monitor/qdev.h"
>>   #include "qmp-commands.h"
>>   #include "sysemu/arch_init.h"
>> +#include "sysemu/sysemu.h"
>> +#include "hw/boards.h"
>> +#include "sysemu/cpus.h"
>>   #include "qemu/config-file.h"
>>
>>   /*
>> @@ -442,6 +445,14 @@ DeviceState *qdev_device_add(QemuOpts *opts)
>>           return NULL;
>>       }
>>
>> +    if (driver && current_machine &&
>> +        strcmp(driver, current_machine->cpu_device_str) == 0) {
>> +        if (smp_cpus == max_cpus) {
>> +            qerror_report(QERR_MAX_CPUS);
>
> As mentioned on 7/8, this should best be handled on QOM realize level.
> That way we get the check consistently and can pass on the error.
>
Yep.

> Also this hunk seems misplaced in this patch, not being s390-only code.
>
Agree.  I'll do a better job organizing the code for next submission.

>> +            return NULL;
>> +        }
>> +    }
>> +
>>       k = DEVICE_CLASS(obj);
>>
>>       /* find bus */
>> @@ -498,6 +509,12 @@ DeviceState *qdev_device_add(QemuOpts *opts)
>>           qerror_report(QERR_DEVICE_INIT_FAILED, driver);
>>           return NULL;
>>       }
>> +
>> +    if (driver && current_machine &&
>> +        strcmp(driver, current_machine->cpu_device_str) == 0) {
>> +        resume_all_vcpus();
>> +    }
>
> Ditto, generic change.
>
> Hasn't this been obsoleted? Hot-added CPUs get resumed in
> qom/cpu.c:cpu_common_realizefn() now. And CPUs added with -device should
> be resumed together with machine-created CPUs from what I recall.
> If something doesn't work as expected, please explicitly say so, then we
> can fix it in its own patch and optionally backport it.
>

With this call removed, the guest hangs on a hotplug via device_add. 
That is about all I was able to determine about the situation.  Perhaps 
there is another way to solve the problem other than calling 
resume_all_vcpus()?

>> +
>>       qdev->opts = opts;
>>       return qdev;
>>   }
>> diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
>> index 23fe51f..8b92c9c 100644
>> --- a/target-s390x/cpu.c
>> +++ b/target-s390x/cpu.c
>> @@ -29,6 +29,8 @@
>>   #include "hw/hw.h"
>>   #ifndef CONFIG_USER_ONLY
>>   #include "sysemu/arch_init.h"
>> +#include "sysemu/sysemu.h"
>> +#include "hw/s390x/sclp.h"
>>   #endif
>>
>>   #define CR0_RESET       0xE0UL
>> @@ -106,6 +108,9 @@ static void s390_cpu_realizefn(DeviceState *dev, Error **errp)
>>       cpu_reset(CPU(cpu));
>>
>>       scc->parent_realize(dev, errp);
>> +
>> +    cpu_synchronize_post_init(CPU(dev));
>
> This is already done as part of parent_realize above, please drop.
>

Ok.

>> +    raise_irq_cpu_hotplug();
>
> [FTR: introduced in 3/8]
>
> Shouldn't this be conditional on DeviceState::hotplugged, just like
> cpu_synchronize_post_init() in qom/cpu.c?
>

Yes! Thanks for pointing that out. I had not noticed that "qdev_hotplug" 
existed.  This can be used where ever the hotplug case must differ from 
the startup case.  Specifically here.

>>   }
>>
>>   static void s390_cpu_initfn(Object *obj)
>> @@ -113,8 +118,9 @@ static void s390_cpu_initfn(Object *obj)
>>       CPUState *cs = CPU(obj);
>>       S390CPU *cpu = S390_CPU(obj);
>>       CPUS390XState *env = &cpu->env;
>> +    int cpu_num = s390_cpu_get_free_state_idx();
>>       static bool inited;
>> -    static int cpu_num = 0;
>> +
>>   #if !defined(CONFIG_USER_ONLY)
>>       struct tm tm;
>>   #endif
>> @@ -134,13 +140,20 @@ static void s390_cpu_initfn(Object *obj)
>>        * initial ipl */
>>       cs->halted = 1;
>>   #endif
>> -    env->cpu_num = cpu_num++;
>> +    s390_cpu_set_state(cpu_num, cpu);
>
> This function name is rather confusing here, can you find a more precise
> name?
>
> In fact, can't we replace the ipi_states[] array with QOM link<S390CPU>
> properties? All we would be doing here is adding or setting a link from
> /machine/cpu[cpu_num] (or so) to this instance.
>

I'm not sure what is meant by "QOM link<S390CPU> properties?".  I assume 
you mean add a poperty to the s390-cpu QOM object to hold the S390CPU 
pointer?  If so, I completely agree.  That would make life easier.

>> +    cs->cpu_index = cpu_num;
>
> This is after cpu_exec_init(), so fiddling with cpu_index should be OK.
>
> But perhaps you should override CPUClass::get_arch_id to return cpu_num
> directly, for cpu_exists() / cpu-add? If this is about hot-remove then
> we may need a more generic solution. At least I don't see you changing
> any cpu_index-dependent code here.
>

I'm not 100% sure what cpu_index/cpu_num is used for or how they differ. 
  Could you elaborate on the change you are requesting?

>> +    env->cpu_num = cpu_num;
>>       env->ext_index = -1;
>
>> +    env->cpu_model_str = "host";
>
> This is unneeded, cpu_model_str is only used for linux-user, where your
> -smp twiddling does not apply.
>
I'm not sure I understand.  i386 architecture sets this in 
target-i386:cpu.c:cpu_x86_create().  Why would s390 not need to set it here?

>> +    cpu->env.exception_index = EXCP_HLT;
>> +    cpu->env.storage_keys = s390_get_storage_keys_p();
>
> Moved from s390_init_cpus(), fine.
>
>>
>>       if (tcg_enabled() && !inited) {
>>           inited = true;
>>           s390x_translate_init();
>>       }
>> +
>> +    smp_cpus += 1;
>
> Won't we need some form of locking?
>

Monitor commands are processed by the main qemu thread, right? if so, 
aren't they processed in a serial fashion?  Thereby eliminating the 
possibility of concurrent updates?

Hmmmm, I'm just noticing that x86 does not seem to update smp_cpus when 
processing a hot plug.  Perhaps updating smp_cpus is not required? 
Thoughts?

> If we fiddle with a global CPU counter, we should do so in qom/cpu.c,
> not just in s390x code.
>
>>   }
>>
>>   static void s390_cpu_finalize(Object *obj)
>> @@ -152,6 +165,12 @@ static void s390_cpu_finalize(Object *obj)
>>   #endif
>>   }
>>
>> +static int s390_cpu_unplug(DeviceState *dev)
>> +{
>> +    fprintf(stderr, "Removal of CPU devices is not supported.\n");
>> +    return -1;
>> +}
>> +
>>   static const VMStateDescription vmstate_s390_cpu = {
>>       .name = "cpu",
>>       .unmigratable = 1,
>> @@ -165,6 +184,7 @@ static void s390_cpu_class_init(ObjectClass *oc, void *data)
>>
>>       scc->parent_realize = dc->realize;
>>       dc->realize = s390_cpu_realizefn;
>> +    dc->unplug = s390_cpu_unplug;
>>
>>       scc->parent_reset = cc->reset;
>>       cc->reset = s390_cpu_reset;
>
> That we should do in generic code, too.
>
> Plus I count 6x dc->unplug plus 4x k->unplug, so we should probably
> enhance the function signature with an Error** to properly return the
> error message, since printing it to stderr means it won't show up on the
> monitor console. I'll prepare a patch.

Sounds good.  Andreas, thanks very much for your thorough review.

-- 
-- Jason J. Herne (jjherne@linux.vnet.ibm.com)