Re: [PATCH v7] kvm: make vcpu life cycle separated from kvm instance

From: Avi Kivity <avi@redhat.com>
To: Liu ping fan <kernelfans@gmail.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	aliguori@us.ibm.com, gleb@redhat.com, mtosatti@redhat.com,
	xiaoguangrong.eric@gmail.com, jan.kiszka@web.de,
	yoshikawa.takuya@oss.ntt.co.jp, Rik van Riel <riel@redhat.com>
Subject: Re: [PATCH v7] kvm: make vcpu life cycle separated from kvm instance
Date: Sun, 15 Jan 2012 15:37:19 +0200
Message-ID: <4F12D68F.7000701@redhat.com>
In-Reply-To: <CAFgQCTv1zqo+gPEWtGEcEwsVzeOM-otDK-W51LLphBY5q+dnbg@mail.gmail.com>

On 01/15/2012 03:17 PM, Liu ping fan wrote:
> On Thu, Jan 12, 2012 at 8:37 PM, Avi Kivity <avi@redhat.com> wrote:
> > On 01/07/2012 04:55 AM, Liu Ping Fan wrote:
> >> From: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
> >>
> >> Currently, a vcpu is destroyed only after the kvm instance is
> >> destroyed. As a result, a vcpu stays idle in the kernel and cannot
> >> be freed when it is unplugged in the guest.
> >>
> >> Change this so that vcpus are destroyed before the kvm instance, so a vcpu MUST
> >
> > Must?
> >
> Yes: kvm_arch_vcpu_destruct() calls kvm_put_kvm(kvm), so the kvm
> instance can be destroyed only after all vcpus have been destroyed.

Oh.  Words like MUST imply that the user has to do something different. 
It's just that the normal order of operations changes.

> >> and CAN be destroyed before the kvm instance. This way, we can remove
> >> a vcpu when the guest no longer needs it.
> >>
> >> TODO: push changes to other archs besides x86.
> >>
> >> -Rename kvm_vcpu_zap to kvm_vcpu_destruct and so on.
> >
> > kvm_vcpu_destroy.
> >
> The name "kvm_arch_vcpu_destroy" is already taken in several archs.

It's actually in all archs.  But having both kvm_arch_vcpu_destroy() and
kvm_arch_vcpu_destruct() isn't going to make the code more
understandable; we need to merge the two or find different names.

> So change
>   kvm_vcpu_zap -> kvm_vcpu_destruct
>   kvm_vcpu_arch_zap -> kvm_vcpu_arch_destruct



> >> -     struct kvm_vcpu *vcpus[KVM_MAX_VCPUS];
> >> +     struct list_head vcpus;
> >
> > This has the potential for a slight performance regression by bouncing
> > an extra cache line, but it's acceptable IMO.  We can always introduce
>
> Sorry, I am not clear about this scenario. Do you mean that changes to
> the vcpu linked list will invalidate cache lines across SMP CPUs? But
> the linked list does not change often.

No, I mean that kvm_for_each_vcpu() now has to bounce a cacheline for
each vcpu, in order to read the link.
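
As a rough userspace illustration of the difference (a sketch only, not
the kernel code; struct vm_array, struct vm_list, array_lookup() and
list_lookup() are hypothetical names), the following counts how many
vcpu objects must be read just to follow the links. With the pointer
array, the pointers live in the VM structure itself; with the list, each
step reads a field stored inside a vcpu, which other CPUs may be writing.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VCPUS 8

struct vcpu {
	int id;
	struct vcpu *next;	/* the link lives inside the vcpu itself */
};

/* Old layout: the VM owns an array of pointers. */
struct vm_array {
	struct vcpu *vcpus[MAX_VCPUS];
	int n;
};

/* New layout: the VM only knows the first vcpu. */
struct vm_list {
	struct vcpu *head;
};

/* Returns how many vcpu objects were read to locate entry idx. */
static int array_lookup(const struct vm_array *vm, int idx, struct vcpu **out)
{
	assert(idx >= 0 && idx < vm->n);
	*out = vm->vcpus[idx];	/* read from the VM's own array */
	return 0;
}

/* Same, but every step reads the ->next field of a vcpu. */
static int list_lookup(const struct vm_list *vm, int idx, struct vcpu **out)
{
	struct vcpu *v = vm->head;
	int link_reads = 0;

	while (v != NULL && idx-- > 0) {
		v = v->next;
		link_reads++;
	}
	*out = v;
	return link_reads;
}
```

The count is only a proxy: whether a given read actually causes a cache
miss depends on what else shares the line and on who wrote it last.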

> >> +             kvm_for_each_vcpu(vcpu, kvm) {
> >> +                     if (kvm->last_boosted_vcpu_id < 0 && !pass) {
> >> +                             pass = 1;
> >> +                             break;
> >> +                     }
> >> +                     if (!pass && !firststart &&
> >> +                         vcpu->vcpu_id != kvm->last_boosted_vcpu_id) {
> >> +                             continue;
> >> +                     } else if (!pass && !firststart) {
> >> +                             firststart = 1;
> >>                               continue;
> >> -                     } else if (pass && i > last_boosted_vcpu)
> >> +                     } else if (pass && !lastone) {
> >> +                             if (vcpu->vcpu_id == kvm->last_boosted_vcpu_id)
> >> +                                     lastone = 1;
> >> +                     } else if (pass && lastone)
> >>                               break;
> >> +
> >
> > Seems like a large change.  Is this because the vcpu list is unordered?
> > Maybe it's better to order it.
> >
> To find the last boosted vcpu (I guess it is the most likely lock
> holder), we must walk the vcpu linked list. With kvm->vcpus[] this
> was easier.

Please simplify this code, it's pretty complicated.

> >> +
> >>                       if (yield_to(task, 1)) {
> >>                               put_task_struct(task);
> >> -                             kvm->last_boosted_vcpu = i;
> >> +                             mutex_lock(&kvm->lock);
> >> +                             kvm->last_boosted_vcpu_id = vcpu->vcpu_id;
> >> +                             mutex_unlock(&kvm->lock);
> >
> > Why take the mutex?
> >
> In kvm_vcpu_release()
>        mutex_lock(&kvm->lock);
>        if (kvm->last_boosted_vcpu_id == vcpu->vcpu_id)
>
> ----------------------------------------->CAN NOT break
>                kvm->last_boosted_vcpu_id = -1;
>        mutex_unlock(&kvm->lock);

It's not pretty taking a vm-wide lock here.  Just make the code
resilient to incorrect vcpu_id.  If it doesn't find
last_boosted_vcpu_id, it should just pick something, like the first or
last vcpu in the list.
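
A minimal userspace sketch of that idea, assuming a simple singly linked
list rather than the kernel's RCU list (struct vcpu here and pick_start()
are hypothetical, for illustration only): the lookup never needs the id
to be valid, so the writer can update it without a lock and a stale value
just means starting from the head.

```c
#include <assert.h>
#include <stddef.h>

struct vcpu {
	int vcpu_id;
	struct vcpu *next;
};

/*
 * Choose the vcpu to try first: the one after the last boosted vcpu,
 * wrapping to the head. If last_boosted_id matches nothing (it is -1,
 * or that vcpu has been unplugged), fall back to the head.
 */
static struct vcpu *pick_start(struct vcpu *head, int last_boosted_id)
{
	struct vcpu *v;

	for (v = head; v != NULL; v = v->next) {
		if (v->vcpu_id == last_boosted_id)
			return v->next != NULL ? v->next : head;
	}
	return head;
}

/* Usage check: valid, wrapping, and stale ids all give a usable start. */
static void pick_start_example(void)
{
	struct vcpu c = { 2, NULL }, b = { 1, &c }, a = { 0, &b };

	assert(pick_start(&a, 0) == &b);	/* normal case */
	assert(pick_start(&a, 2) == &a);	/* wraps around */
	assert(pick_start(&a, 7) == &a);	/* stale id: first vcpu */
}
```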

>
> >>  static int kvm_vcpu_release(struct inode *inode, struct file *filp)
> >>  {
> >>       struct kvm_vcpu *vcpu = filp->private_data;
> >> +     struct kvm *kvm = vcpu->kvm;
> >> +     filp->private_data = NULL;
> >> +
> >> +     mutex_lock(&kvm->lock);
> >> +     list_del_rcu(&vcpu->list);
> >> +     atomic_dec(&kvm->online_vcpus);
> >> +     mutex_unlock(&kvm->lock);
> >> +     synchronize_srcu_expedited(&kvm->srcu);
> >
> > Why _expedited?
> >
> > Even better would be call_srcu() but it doesn't exist.
> >
> > I think we can actually use regular rcu.  The only user that blocks is
> > kvm_vcpu_on_spin(), yes? so we can convert the vcpu to a task using
> > get_pid_task(), then, outside the rcu lock, call yield_to().
> >
> Yes, kvm_vcpu_on_spin() is the only one. But I think that if
> yield_to() is called outside the rcu lock, the code will look like this:
>
> again:
>     rcu_lock()
>     kvm_for_each_vcpu(){
>     ......
>     }
>     rcu_unlock()
>     if (yield_to(task, 1)) {
>     .....
>     } else
>         goto again;
>
> We must walk the linked list again to find the next vcpu.

Annoying... maybe we should use an array instead of a list after all.
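
To make the trade-off concrete, here is a userspace sketch of the array
variant (hypothetical names throughout: struct vm, next_slot(),
try_yield() and boost_one() are not kernel functions, and try_yield()
merely stands in for yield_to()). Because the position is an index, a
failed attempt can resume from the next slot instead of walking from the
start again.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VCPUS 4

struct vcpu {
	int vcpu_id;
	int runnable;
};

struct vm {
	struct vcpu *vcpus[MAX_VCPUS];	/* NULL marks an unplugged slot */
};

/* Index of the next populated slot strictly after `after`, or -1. */
static int next_slot(const struct vm *vm, int after)
{
	assert(after >= -1 && after < MAX_VCPUS);
	for (int i = after + 1; i < MAX_VCPUS; i++) {
		if (vm->vcpus[i] != NULL)
			return i;
	}
	return -1;
}

/* Stand-in for yield_to(): succeeds only if the target can run. */
static int try_yield(const struct vcpu *v)
{
	return v->runnable;
}

/* Returns the id of the vcpu that was boosted, or -1 if none was. */
static int boost_one(const struct vm *vm)
{
	int i = -1;

	while ((i = next_slot(vm, i)) >= 0) {
		/*
		 * In the kernel, only this lookup would sit inside the
		 * read-side critical section; the yield happens outside it.
		 */
		const struct vcpu *v = vm->vcpus[i];

		if (try_yield(v))
			return v->vcpu_id;
		/* Failed: continue from index i, no re-walk needed. */
	}
	return -1;
}
```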

>
> >
> >> +static struct kvm_vcpu *kvm_vcpu_create(struct kvm *kvm, u32 id)
> >> +{
> >> +     struct kvm_vcpu *vcpu;
> >> +     vcpu = kvm_arch_vcpu_create(kvm, id);
> >> +     if (IS_ERR(vcpu))
> >> +             return vcpu;
> >> +     INIT_LIST_HEAD(&vcpu->list);
> >
> > Really needed?
> >
> Yes, it is unnecessary

Why?  list_add_rcu() will overwrite it anyway.
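
A small userspace sketch shows the effect, using a plain
re-implementation modeled on the non-RCU list_add() (this is not
<linux/list.h> itself, and init_is_redundant() is a hypothetical helper):
insertion assigns both links of the new node, so whatever
INIT_LIST_HEAD() stored there beforehand is discarded.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert new_node right after head; both of its links are overwritten. */
static void list_add(struct list_head *new_node, struct list_head *head)
{
	struct list_head *first = head->next;

	assert(first != NULL);	/* the head itself must be initialized */
	new_node->next = first;
	new_node->prev = head;
	first->prev = new_node;
	head->next = new_node;
}

/* Returns 1 if a node ends up identical with or without prior init. */
static int init_is_redundant(void)
{
	struct list_head head_a, head_b, with_init, without_init;

	INIT_LIST_HEAD(&head_a);
	INIT_LIST_HEAD(&head_b);
	INIT_LIST_HEAD(&with_init);
	without_init.next = NULL;	/* stand-in for uninitialized links */
	without_init.prev = NULL;

	list_add(&with_init, &head_a);
	list_add(&without_init, &head_b);

	return with_init.next == &head_a && with_init.prev == &head_a &&
	       without_init.next == &head_b && without_init.prev == &head_b;
}
```

The initialization would only matter if something inspected the node
(for example with list_empty()) before it is added to a list.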

-- 
error compiling committee.c: too many arguments to function
