Re: [Qemu-devel] [RFC v0 PATCH] cpus: Convert cpu_index into a bitmap

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Bharata B Rao <bharata@linux.vnet.ibm.com>
To: "Andreas Färber" <afaerber@suse.de>
Cc: zhugh.fnst@cn.fujitsu.com, ehabkost@redhat.com,
	Alexander Graf <agraf@suse.de>,
	qemu-devel@nongnu.org, imammedo@redhat.com,
	david@gibson.dropbear.id.au
Subject: Re: [Qemu-devel] [RFC v0 PATCH] cpus: Convert cpu_index into a bitmap
Date: Thu, 7 May 2015 15:05:45 +0530	[thread overview]
Message-ID: <20150507093545.GY18380@in.ibm.com> (raw)
In-Reply-To: <20150318062004.GC21610@in.ibm.com>

On Wed, Mar 18, 2015 at 11:50:04AM +0530, Bharata B Rao wrote:
> On Tue, Mar 17, 2015 at 11:56:04AM +0100, Andreas Färber wrote:
> > Am 17.03.2015 um 09:39 schrieb Bharata B Rao:
> > > On Tue, Mar 17, 2015 at 07:56:41AM +0100, Alexander Graf wrote:
> > >>
> > >>
> > >> On 13.03.15 12:56, Bharata B Rao wrote:
> > >>> From: Bharata B Rao <bharata.rao@gmail.com>
> > >>>
> > >>> Currently CPUState.cpu_index is monotonically increasing and a newly
> > >>> created CPU always gets the next higher index. The next available
> > >>> index is calculated by counting the existing number of CPUs. This is
> > >>> fine as long as we only add CPUs, but there are architectures which
> > >>> are starting to support CPU removal too. For an architecture like PowerPC
> > >>> which derives its CPU identifier (device tree ID) from cpu_index, the
> > >>> existing logic of generating cpu_index values causes problems.
> > >>>
> > >>> With the currently proposed method of handling vCPU removal by parking
> > >>> the vCPU fd in QEMU
> > >>> (Ref: http://lists.gnu.org/archive/html/qemu-devel/2015-02/msg02604.html),
> > >>> generating cpu_index this way will not work for PowerPC.
> > >>>
> > >>> This patch changes the way cpu_index is handed out by maintaining
> > >>> a bit map of the CPUs that tracks both addition and removal of CPUs.
> > >>>
> > >>> I am not sure if this is the right and an acceptable approach. The
> > >>> alternative is to do something similar for PowerPC alone and not
> > >>> depend on cpu_index.
> > >>>
> > >>> I have tested this with out-of-the-tree patches for CPU hot plug and
> > >>> removal on x86 and sPAPR PowerPC.
> > >>>
> > >>> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > >>> ---
> > >>>  exec.c                      | 39 +++++++++++++++++++++++++++++----------
> > >>>  include/exec/exec-all.h     |  1 +
> > >>>  target-alpha/cpu.c          |  6 ++++++
> > >>>  target-arm/cpu.c            |  1 +
> > >>>  target-cris/cpu.c           |  6 ++++++
> > >>>  target-i386/cpu.c           |  6 ++++++
> > >>>  target-lm32/cpu.c           |  6 ++++++
> > >>>  target-m68k/cpu.c           |  6 ++++++
> > >>>  target-microblaze/cpu.c     |  6 ++++++
> > >>>  target-mips/cpu.c           |  6 ++++++
> > >>>  target-moxie/cpu.c          |  6 ++++++
> > >>>  target-openrisc/cpu.c       |  6 ++++++
> > >>>  target-ppc/translate_init.c |  6 ++++++
> > >>>  target-s390x/cpu.c          |  1 +
> > >>>  target-sh4/cpu.c            |  6 ++++++
> > >>>  target-sparc/cpu.c          |  1 +
> > >>>  target-tricore/cpu.c        |  5 +++++
> > >>>  target-unicore32/cpu.c      |  6 ++++++
> > >>>  target-xtensa/cpu.c         |  6 ++++++
> > >>>  19 files changed, 116 insertions(+), 10 deletions(-)
> > >>>
> > >>> diff --git a/exec.c b/exec.c
> > >>> index e97071a..7760f2d 100644
> > >>> --- a/exec.c
> > >>> +++ b/exec.c
> > >>> @@ -530,21 +530,40 @@ void tcg_cpu_address_space_init(CPUState *cpu, AddressSpace *as)
> > >>>  }
> > >>>  #endif
> > >>>  
> > >>> +static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
> > >>> +
> > >>> +#ifdef CONFIG_USER_ONLY
> > >>> +int max_cpus = 1; /* TODO: Check if this is correct ? */
> > >>> +#endif
> > >>> +
> > >>> +static int cpu_get_free_index(void)
> > >>> +{
> > >>> +    int cpu = find_first_zero_bit(cpu_index_map, max_cpus);
> > >>> +
> > >>> +    if (cpu == max_cpus) {
> > >>> +        fprintf(stderr, "WARNING: qemu: Trying to use more "
> > >>> +                        "CPUs than allowed max of %d\n", max_cpus);
> > >>> +        return max_cpus;
> > >>> +    } else {
> > >>> +        bitmap_set(cpu_index_map, cpu, 1);
> > >>> +        return cpu;
> > >>> +    }
> > >>> +}
> > >>> +
> > >>> +void cpu_exec_exit(CPUState *cpu)
> > >>> +{
> > >>> +    bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
> > >>> +}
> > >>> +
> > >>>  void cpu_exec_init(CPUArchState *env)
> > >>>  {
> > >>>      CPUState *cpu = ENV_GET_CPU(env);
> > >>>      CPUClass *cc = CPU_GET_CLASS(cpu);
> > >>> -    CPUState *some_cpu;
> > >>> -    int cpu_index;
> > >>>  
> > >>>  #if defined(CONFIG_USER_ONLY)
> > >>>      cpu_list_lock();
> > >>>  #endif
> > >>> -    cpu_index = 0;
> > >>> -    CPU_FOREACH(some_cpu) {
> > >>> -        cpu_index++;
> > >>> -    }
> > >>> -    cpu->cpu_index = cpu_index;
> > >>> +    cpu->cpu_index = cpu_get_free_index();
> > >>>      cpu->numa_node = 0;
> > >>>      QTAILQ_INIT(&cpu->breakpoints);
> > >>>      QTAILQ_INIT(&cpu->watchpoints);
> > >>> @@ -558,16 +577,16 @@ void cpu_exec_init(CPUArchState *env)
> > >>>      cpu_list_unlock();
> > >>>  #endif
> > >>>      if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
> > >>> -        vmstate_register(NULL, cpu_index, &vmstate_cpu_common, cpu);
> > >>> +        vmstate_register(NULL, cpu->cpu_index, &vmstate_cpu_common, cpu);
> > >>>      }
> > >>>  #if defined(CPU_SAVE_VERSION) && !defined(CONFIG_USER_ONLY)
> > >>> -    register_savevm(NULL, "cpu", cpu_index, CPU_SAVE_VERSION,
> > >>> +    register_savevm(NULL, "cpu", cpu->cpu_index, CPU_SAVE_VERSION,
> > >>>                      cpu_save, cpu_load, env);
> > >>>      assert(cc->vmsd == NULL);
> > >>>      assert(qdev_get_vmsd(DEVICE(cpu)) == NULL);
> > >>>  #endif
> > >>>      if (cc->vmsd != NULL) {
> > >>> -        vmstate_register(NULL, cpu_index, cc->vmsd, cpu);
> > >>> +        vmstate_register(NULL, cpu->cpu_index, cc->vmsd, cpu);
> > >>>      }
> > >>>  }
> > >>>  
> > >>> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> > >>> index 8eb0db3..95fbba0 100644
> > >>> --- a/include/exec/exec-all.h
> > >>> +++ b/include/exec/exec-all.h
> > >>> @@ -89,6 +89,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
> > >>>                                target_ulong pc, target_ulong cs_base, int flags,
> > >>>                                int cflags);
> > >>>  void cpu_exec_init(CPUArchState *env);
> > >>> +void cpu_exec_exit(CPUState *cpu);
> > >>>  void QEMU_NORETURN cpu_loop_exit(CPUState *cpu);
> > >>>  int page_unprotect(target_ulong address, uintptr_t pc, void *puc);
> > >>>  void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
> > >>> diff --git a/target-alpha/cpu.c b/target-alpha/cpu.c
> > >>> index a98b7d8..7c57165 100644
> > >>> --- a/target-alpha/cpu.c
> > >>> +++ b/target-alpha/cpu.c
> > >>> @@ -250,6 +250,11 @@ static const TypeInfo ev68_cpu_type_info = {
> > >>>      .parent = TYPE("ev67"),
> > >>>  };
> > >>>  
> > >>> +static void alpha_cpu_finalize(Object *obj)
> > >>> +{
> > >>> +    cpu_exec_exit(CPU(obj));
> > >>> +}
> > >>> +
> > >>>  static void alpha_cpu_initfn(Object *obj)
> > >>>  {
> > >>>      CPUState *cs = CPU(obj);
> > >>> @@ -305,6 +310,7 @@ static const TypeInfo alpha_cpu_type_info = {
> > >>>      .parent = TYPE_CPU,
> > >>>      .instance_size = sizeof(AlphaCPU),
> > >>>      .instance_init = alpha_cpu_initfn,
> > >>> +    .instance_finalize = alpha_cpu_finalize,
> > >>
> > >> Would it be possible to put this into TYPE_CPU->instance_finalize instead?
> > > 
> > > Yes possible and that would be much cleaner since I wouldn't have to touch
> > > all archs. But it will be asymmetric in some sense as cpu_exec_init() is
> > > called from all individual cpus' instance_init but cpu_exec_exit() will be
> > > called from parent's (TYPE_CPU) instance_finalize. If that is fine, I shall
> > > post v2 with this change.
> > 
> > Could you check: Wasn't there a patch from Fujitsu to move
> > cpu_exec_init() to generic code? If both were generic, that would be
> > fine. If that is problematic, I would accept the mismatch as long as it
> > is "safe". That is, instance_finalize needs to handle any state of the
> > object,
> 
> There is a patch from Zhu to move only vmstate_register related bits
> from cpu_exec_init to cpu_common_realizefn.
> 
> http://lists.gnu.org/archive/html/qemu-devel/2015-01/msg01550.html
> 
> > and I think these two are better suited for realize/unrealize
> > than instance_init/instance_finalize.
> 
> And Eduardo has a patch to move cpu_exec_init call from instance_init
> to realize for target-i386.
> 
> https://lists.gnu.org/archive/html/qemu-devel/2015-03/msg01056.html
> 
> So if the preferred way is to call cpu_exec_init from realize, then
> Eduardo - Can you do this for all archs ? My limited testing shows
> that it (moving cpu_exec_init from instance_init to realize) works for
> sPAPR PowerPC too, but not sure about other targets.
> 
> After there is consensus on the above two patches, I can do the
> cpu_index to bitmap changes.

I was planning to re-post this patch now, but wanted to know the plan
regarding the related two patchsets that I have mentioned above.

1. Patch by Eduardo to move cpu_exec_init from instance_init to realizefn
2. Patch by Zhu to move vmstate_register related bits from cpu_exec_init
   to cpu_common_realizefn.

I don't see any more updates on this, are these obsolete or still under
consideration ?

Regards,
Bharata.

next prev parent reply	other threads:[~2015-05-07  9:36 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-13 11:56 [Qemu-devel] [RFC v0 PATCH] cpus: Convert cpu_index into a bitmap Bharata B Rao
2015-03-17  6:56 ` Alexander Graf
2015-03-17  8:39   ` Bharata B Rao
2015-03-17 10:56     ` Andreas Färber
2015-03-18  6:20       ` Bharata B Rao
2015-03-19 13:58         ` Eduardo Habkost
2015-05-07  9:35         ` Bharata B Rao [this message]
2015-05-07 15:33           ` Eduardo Habkost
2015-03-17 10:51 ` Andreas Färber
2015-03-18  6:28   ` Bharata B Rao
2015-03-18  0:49 ` David Gibson
2015-03-18  6:34   ` Bharata B Rao
2015-03-19  4:43     ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150507093545.GY18380@in.ibm.com \
    --to=bharata@linux.vnet.ibm.com \
    --cc=afaerber@suse.de \
    --cc=agraf@suse.de \
    --cc=david@gibson.dropbear.id.au \
    --cc=ehabkost@redhat.com \
    --cc=imammedo@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=zhugh.fnst@cn.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).