From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:36437)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <afaerber@suse.de>) id 1YXpAQ-0001V9-EK
	for qemu-devel@nongnu.org; Tue, 17 Mar 2015 06:56:12 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <afaerber@suse.de>) id 1YXpAN-0007AX-7z
	for qemu-devel@nongnu.org; Tue, 17 Mar 2015 06:56:10 -0400
Received: from cantor2.suse.de ([195.135.220.15]:35198 helo=mx2.suse.de)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <afaerber@suse.de>) id 1YXpAM-0007AN-UB
	for qemu-devel@nongnu.org; Tue, 17 Mar 2015 06:56:07 -0400
Message-ID: <55080844.5070302@suse.de>
Date: Tue, 17 Mar 2015 11:56:04 +0100
From: =?windows-1252?Q?Andreas_F=E4rber?= <afaerber@suse.de>
MIME-Version: 1.0
References: <1426247796-1657-1-git-send-email-bharata@linux.vnet.ibm.com>
	<5507D029.8040608@suse.de> <20150317083951.GB21610@in.ibm.com>
In-Reply-To: <20150317083951.GB21610@in.ibm.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [RFC v0 PATCH] cpus: Convert cpu_index into a
	bitmap
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: bharata@linux.vnet.ibm.com
Cc: zhugh.fnst@cn.fujitsu.com, Alexander Graf <agraf@suse.de>, qemu-devel@nongnu.org, Bharata B Rao <bharata.rao@gmail.com>, imammedo@redhat.com, david@gibson.dropbear.id.au

Am 17.03.2015 um 09:39 schrieb Bharata B Rao:
> On Tue, Mar 17, 2015 at 07:56:41AM +0100, Alexander Graf wrote:
>>
>>
>> On 13.03.15 12:56, Bharata B Rao wrote:
>>> From: Bharata B Rao <bharata.rao@gmail.com>
>>>
>>> Currently CPUState.cpu_index is monotonically increasing and a newly
>>> created CPU always gets the next higher index. The next available
>>> index is calculated by counting the existing number of CPUs. This is
>>> fine as long as we only add CPUs, but there are architectures which
>>> are starting to support CPU removal too. For an architecture like Pow=
erPC
>>> which derives its CPU identifier (device tree ID) from cpu_index, the
>>> existing logic of generating cpu_index values causes problems.
>>>
>>> With the currently proposed method of handling vCPU removal by parkin=
g
>>> the vCPU fd in QEMU
>>> (Ref: http://lists.gnu.org/archive/html/qemu-devel/2015-02/msg02604.h=
tml),
>>> generating cpu_index this way will not work for PowerPC.
>>>
>>> This patch changes the way cpu_index is handed out by maintaining
>>> a bit map of the CPUs that tracks both addition and removal of CPUs.
>>>
>>> I am not sure if this is the right and an acceptable approach. The
>>> alternative is to do something similar for PowerPC alone and not
>>> depend on cpu_index.
>>>
>>> I have tested this with out-of-the-tree patches for CPU hot plug and
>>> removal on x86 and sPAPR PowerPC.
>>>
>>> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
>>> ---
>>>  exec.c                      | 39 +++++++++++++++++++++++++++++------=
----
>>>  include/exec/exec-all.h     |  1 +
>>>  target-alpha/cpu.c          |  6 ++++++
>>>  target-arm/cpu.c            |  1 +
>>>  target-cris/cpu.c           |  6 ++++++
>>>  target-i386/cpu.c           |  6 ++++++
>>>  target-lm32/cpu.c           |  6 ++++++
>>>  target-m68k/cpu.c           |  6 ++++++
>>>  target-microblaze/cpu.c     |  6 ++++++
>>>  target-mips/cpu.c           |  6 ++++++
>>>  target-moxie/cpu.c          |  6 ++++++
>>>  target-openrisc/cpu.c       |  6 ++++++
>>>  target-ppc/translate_init.c |  6 ++++++
>>>  target-s390x/cpu.c          |  1 +
>>>  target-sh4/cpu.c            |  6 ++++++
>>>  target-sparc/cpu.c          |  1 +
>>>  target-tricore/cpu.c        |  5 +++++
>>>  target-unicore32/cpu.c      |  6 ++++++
>>>  target-xtensa/cpu.c         |  6 ++++++
>>>  19 files changed, 116 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/exec.c b/exec.c
>>> index e97071a..7760f2d 100644
>>> --- a/exec.c
>>> +++ b/exec.c
>>> @@ -530,21 +530,40 @@ void tcg_cpu_address_space_init(CPUState *cpu, =
AddressSpace *as)
>>>  }
>>>  #endif
>>> =20
>>> +static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
>>> +
>>> +#ifdef CONFIG_USER_ONLY
>>> +int max_cpus =3D 1; /* TODO: Check if this is correct ? */
>>> +#endif
>>> +
>>> +static int cpu_get_free_index(void)
>>> +{
>>> +    int cpu =3D find_first_zero_bit(cpu_index_map, max_cpus);
>>> +
>>> +    if (cpu =3D=3D max_cpus) {
>>> +        fprintf(stderr, "WARNING: qemu: Trying to use more "
>>> +                        "CPUs than allowed max of %d\n", max_cpus);
>>> +        return max_cpus;
>>> +    } else {
>>> +        bitmap_set(cpu_index_map, cpu, 1);
>>> +        return cpu;
>>> +    }
>>> +}
>>> +
>>> +void cpu_exec_exit(CPUState *cpu)
>>> +{
>>> +    bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
>>> +}
>>> +
>>>  void cpu_exec_init(CPUArchState *env)
>>>  {
>>>      CPUState *cpu =3D ENV_GET_CPU(env);
>>>      CPUClass *cc =3D CPU_GET_CLASS(cpu);
>>> -    CPUState *some_cpu;
>>> -    int cpu_index;
>>> =20
>>>  #if defined(CONFIG_USER_ONLY)
>>>      cpu_list_lock();
>>>  #endif
>>> -    cpu_index =3D 0;
>>> -    CPU_FOREACH(some_cpu) {
>>> -        cpu_index++;
>>> -    }
>>> -    cpu->cpu_index =3D cpu_index;
>>> +    cpu->cpu_index =3D cpu_get_free_index();
>>>      cpu->numa_node =3D 0;
>>>      QTAILQ_INIT(&cpu->breakpoints);
>>>      QTAILQ_INIT(&cpu->watchpoints);
>>> @@ -558,16 +577,16 @@ void cpu_exec_init(CPUArchState *env)
>>>      cpu_list_unlock();
>>>  #endif
>>>      if (qdev_get_vmsd(DEVICE(cpu)) =3D=3D NULL) {
>>> -        vmstate_register(NULL, cpu_index, &vmstate_cpu_common, cpu);
>>> +        vmstate_register(NULL, cpu->cpu_index, &vmstate_cpu_common, =
cpu);
>>>      }
>>>  #if defined(CPU_SAVE_VERSION) && !defined(CONFIG_USER_ONLY)
>>> -    register_savevm(NULL, "cpu", cpu_index, CPU_SAVE_VERSION,
>>> +    register_savevm(NULL, "cpu", cpu->cpu_index, CPU_SAVE_VERSION,
>>>                      cpu_save, cpu_load, env);
>>>      assert(cc->vmsd =3D=3D NULL);
>>>      assert(qdev_get_vmsd(DEVICE(cpu)) =3D=3D NULL);
>>>  #endif
>>>      if (cc->vmsd !=3D NULL) {
>>> -        vmstate_register(NULL, cpu_index, cc->vmsd, cpu);
>>> +        vmstate_register(NULL, cpu->cpu_index, cc->vmsd, cpu);
>>>      }
>>>  }
>>> =20
>>> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
>>> index 8eb0db3..95fbba0 100644
>>> --- a/include/exec/exec-all.h
>>> +++ b/include/exec/exec-all.h
>>> @@ -89,6 +89,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>>>                                target_ulong pc, target_ulong cs_base,=
 int flags,
>>>                                int cflags);
>>>  void cpu_exec_init(CPUArchState *env);
>>> +void cpu_exec_exit(CPUState *cpu);
>>>  void QEMU_NORETURN cpu_loop_exit(CPUState *cpu);
>>>  int page_unprotect(target_ulong address, uintptr_t pc, void *puc);
>>>  void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_add=
r_t end,
>>> diff --git a/target-alpha/cpu.c b/target-alpha/cpu.c
>>> index a98b7d8..7c57165 100644
>>> --- a/target-alpha/cpu.c
>>> +++ b/target-alpha/cpu.c
>>> @@ -250,6 +250,11 @@ static const TypeInfo ev68_cpu_type_info =3D {
>>>      .parent =3D TYPE("ev67"),
>>>  };
>>> =20
>>> +static void alpha_cpu_finalize(Object *obj)
>>> +{
>>> +    cpu_exec_exit(CPU(obj));
>>> +}
>>> +
>>>  static void alpha_cpu_initfn(Object *obj)
>>>  {
>>>      CPUState *cs =3D CPU(obj);
>>> @@ -305,6 +310,7 @@ static const TypeInfo alpha_cpu_type_info =3D {
>>>      .parent =3D TYPE_CPU,
>>>      .instance_size =3D sizeof(AlphaCPU),
>>>      .instance_init =3D alpha_cpu_initfn,
>>> +    .instance_finalize =3D alpha_cpu_finalize,
>>
>> Would it be possible to put this into TYPE_CPU->instance_finalize inst=
ead?
>=20
> Yes possible and that would be much cleaner since I wouldn't have to to=
uch
> all archs. But it will be asymmetric in some sense as cpu_exec_init() i=
s
> called from all individual cpus' instance_init but cpu_exec_exit() will=
 be
> called from parent's (TYPE_CPU) instance_finalize. If that is fine, I s=
hall
> post v2 with this change.

Could you check: Wasn't there a patch from Fujitsu to move
cpu_exec_init() to generic code? If both were generic, that would be
fine. If that is problematic, I would accept the mismatch as long as it
is "safe". That is, instance_finalize needs to handle any state of the
object, and I think these two are better suited for realize/unrealize
than instance_init/instance_finalize.

The unanswered question around the Fujitsu patches is mostly whether
some of such code movements impact timing in the sense of breaking
migration compatibility by changing the relative order.

Probably something my CPU test case should be extended with.

Regards,
Andreas

--=20
SUSE Linux GmbH, Maxfeldstr. 5, 90409 N=FCrnberg, Germany
GF: Felix Imend=F6rffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu,
Graham Norton; HRB 21284 (AG N=FCrnberg)