Linux virtualization list
 help / color / mirror / Atom feed
* Re: [PATCH RFC V3 2/4] kvm hypervisor : Add a hypercall to KVM hypervisor to support pv-ticketlocks
From: Avi Kivity @ 2011-12-07 12:47 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Raghavendra K T, x86, Peter Zijlstra, Virtualization,
	H. Peter Anvin, Stefano Stabellini, Xen, Dave Jiang, KVM,
	Raghavendra K T, Ingo Molnar, Rik van Riel, Konrad Rzeszutek Wilk,
	Srivatsa Vaddagiri, Jeremy Fitzhardinge, Sasha Levin, Sedat Dilek,
	Thomas Gleixner, Yinghai Lu, Greg Kroah-Hartman, LKML,
	Dave Hansen, Suzuk
In-Reply-To: <20111207123330.GA32212@amt.cnet>

On 12/07/2011 02:33 PM, Marcelo Tosatti wrote:
> > 
> > Also Avi pointed that, logically kvm_arch_vcpu_ioctl_set_mpstate should
> > be called only in vcpu thread, so after further debugging, I noticed
> > that,  setting vcpuN->mp_state = KVM_MP_STATE_RUNNABLE; is not
> > necessary.
> > I 'll remove that in the next patch. Thanks for pointing.
>
> In fact you don't need kvm_arch_vcpu_ioctl_set_mpstate either, only the
> new "kicked" flag.

If we have a kicked flag, it becomes necessary to live migrate it.

Maybe we can change KVM_GET_MP_STATE to fold the kicked flag into the mp
state (converting HALTED into RUNNABLE).

Also I think we can keep the kicked flag in vcpu->requests, no need for
new storage.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply

* Re: [PATCH RFC V3 2/4] kvm hypervisor : Add a hypercall to KVM hypervisor to support pv-ticketlocks
From: Marcelo Tosatti @ 2011-12-07 12:33 UTC (permalink / raw)
  To: Raghavendra K T
  Cc: x86, Peter Zijlstra, Virtualization, H. Peter Anvin,
	Stefano Stabellini, Xen, Dave Jiang, KVM, Raghavendra K T,
	Ingo Molnar, Avi Kivity, Rik van Riel, Konrad Rzeszutek Wilk,
	Srivatsa Vaddagiri, Jeremy Fitzhardinge, Sasha Levin, Sedat Dilek,
	Thomas Gleixner, Yinghai Lu, Greg Kroah-Hartman, LKML,
	Dave Hansen, Suzuki Poulose <suzuki>
In-Reply-To: <4EDF5413.1030107@linux.vnet.ibm.com>

On Wed, Dec 07, 2011 at 05:24:59PM +0530, Raghavendra K T wrote:
> On 12/07/2011 04:18 PM, Marcelo Tosatti wrote:
> >On Wed, Nov 30, 2011 at 02:29:59PM +0530, Raghavendra K T wrote:
> >>
> >>+/*
> >>+ * kvm_pv_kick_cpu_op:  Kick a vcpu.
> >>+ *
> >>+ * @cpu - vcpu to be kicked.
> >>+ */
> >>+static void kvm_pv_kick_cpu_op(struct kvm *kvm, int cpu)
> >>+{
> >>+	struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, cpu);
> >>+	struct kvm_mp_state mp_state;
> >>+
> >>+	mp_state.mp_state = KVM_MP_STATE_RUNNABLE;
> >
> >Since vcpu->mp_state is not protected by a lock, this is potentially racy. For example:
> >
> >CPU0						CPU1
> >kvm_pv_kick_cpu_op				running vcpuN
> >vcpuN->mp_state = KVM_MP_STATE_RUNNABLE;
> >						kvm_emulate_halt
> >						vcpuN->mp_state = KVM_MP_STATE_HALTED
> >
> >Is it harmless to lose a kick?
> >
> 
> Yes you are right. It was potentially racy and it was harmful too!.
> I had observed that it was stalling the CPU before I introduced
> kicked flag.
> 
> But now,
> 
> vcpu->kicked = 1  ==> kvm_make_request(KVM_REQ_UNHALT, vcpu); ==>

Ok, please use a more descriptive name, such as "pvlock_kicked" or
something.

> 
> __vcpu_run() ==> kvm_check_request(KVM_REQ_UNHALT, vcpu) ==>
> 
> vcpuN->mp_state = KVM_MP_STATE_RUNNABLE; so eventually we will end up
> in RUNNABLE.
> 
> Also Avi pointed that, logically kvm_arch_vcpu_ioctl_set_mpstate should
> be called only in vcpu thread, so after further debugging, I noticed
> that,  setting vcpuN->mp_state = KVM_MP_STATE_RUNNABLE; is not
> necessary.
> I 'll remove that in the next patch. Thanks for pointing.

In fact you don't need kvm_arch_vcpu_ioctl_set_mpstate either, only the
new "kicked" flag.

> 
> 
> 			

^ permalink raw reply

* Re: [net-next RFC PATCH 5/5] virtio-net: flow director support
From: Jason Wang @ 2011-12-07 12:10 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: krkumar2, kvm, mst, netdev, virtualization, levinsasha928,
	bhutchings
In-Reply-To: <CAJSP0QUGWCf5WHEeCXzqZeF2CvpycxrGo-uPSfpWD1rWD3zeSg@mail.gmail.com>

On 12/07/2011 05:08 PM, Stefan Hajnoczi wrote:
[...]
>> >  Consider the complexity of the host nic each with their own steering
>> >  features,  this series make the first step with minimal effort to try to let
>> >  guest driver and host tap/macvtap co-operate like what physical nic does.
>> >  There may be other method, but performance numbers is also needed to give
>> >  the answer.
> I agree that performance results for this need to be shown.
>
> My original point is really that it's not a good idea to take
> individual steps without a good big picture because this will change
> the virtio-net device specification.  If this turns out to be a dead
> end then hosts will need to continue to support the interface forever
> (legacy guests could still try to use it).  So please first explain
> what the full stack picture is going to look like and how you think it
> will lead to better performance.  You don't need to have all the code
> or evidence, but just enough explanation so we see where this is all
> going.
I think I mention too little in the cover message.

There's no much changes with Krishna's series except the method that 
choosing a rx virtqueue. Since original series use different hash 
methods in host (rxhash) and guest (txhash), a different virtqueue were 
chose for a flow which could lead packets of a flow to be handled by 
different vhost thread and vcpu. This may damage the performance.

This series tries to let one vhost thread to process the packets of a 
flow and also let the packets to be sent directly to a vcpu local to the 
thread process the data. This is done by letting guest tell the desired 
queue form which it want to receive the pakcet of a dedicated flow.

So passing the hash from host to guest is needed to get the same hash in 
the two sides. Then a guest programmable hash to queue table were 
introduced and guest co-operate with the host through accelerate RFS in 
guest.

^ permalink raw reply

* Re: [PATCH RFC V3 2/4] kvm hypervisor : Add a hypercall to KVM hypervisor to support pv-ticketlocks
From: Raghavendra K T @ 2011-12-07 11:54 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: x86, Peter Zijlstra, Virtualization, H. Peter Anvin,
	Stefano Stabellini, Xen, Dave Jiang, KVM, Raghavendra K T,
	Ingo Molnar, Avi Kivity, Rik van Riel, Konrad Rzeszutek Wilk,
	Srivatsa Vaddagiri, Jeremy Fitzhardinge, Sasha Levin, Sedat Dilek,
	Thomas Gleixner, Yinghai Lu, Greg Kroah-Hartman, LKML,
	Dave Hansen, Suzuki Poulose <suzuki>
In-Reply-To: <20111207104849.GA24849@amt.cnet>

On 12/07/2011 04:18 PM, Marcelo Tosatti wrote:
> On Wed, Nov 30, 2011 at 02:29:59PM +0530, Raghavendra K T wrote:
>>
>> +/*
>> + * kvm_pv_kick_cpu_op:  Kick a vcpu.
>> + *
>> + * @cpu - vcpu to be kicked.
>> + */
>> +static void kvm_pv_kick_cpu_op(struct kvm *kvm, int cpu)
>> +{
>> +	struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, cpu);
>> +	struct kvm_mp_state mp_state;
>> +
>> +	mp_state.mp_state = KVM_MP_STATE_RUNNABLE;
>
> Since vcpu->mp_state is not protected by a lock, this is potentially racy. For example:
>
> CPU0						CPU1
> kvm_pv_kick_cpu_op				running vcpuN
> vcpuN->mp_state = KVM_MP_STATE_RUNNABLE;
> 						kvm_emulate_halt
> 						vcpuN->mp_state = KVM_MP_STATE_HALTED
>
> Is it harmless to lose a kick?
>

Yes you are right. It was potentially racy and it was harmful too!. I 
had observed that it was stalling the CPU before I introduced kicked flag.

But now,

vcpu->kicked = 1  ==> kvm_make_request(KVM_REQ_UNHALT, vcpu); ==>

__vcpu_run() ==> kvm_check_request(KVM_REQ_UNHALT, vcpu) ==>

vcpuN->mp_state = KVM_MP_STATE_RUNNABLE; so eventually we will end up
in RUNNABLE.

Also Avi pointed that, logically kvm_arch_vcpu_ioctl_set_mpstate should
be called only in vcpu thread, so after further debugging, I noticed
that,  setting vcpuN->mp_state = KVM_MP_STATE_RUNNABLE; is not
necessary.
I 'll remove that in the next patch. Thanks for pointing.


			

^ permalink raw reply

* Re: [PATCH v4 09/12] virtio: net: Add freeze, restore handlers to support S4
From: Michael S. Tsirkin @ 2011-12-07 11:46 UTC (permalink / raw)
  To: Amit Shah; +Cc: linux-kernel, levinsasha928, Virtualization List
In-Reply-To: <20111207103529.GG4651@amit-x200.redhat.com>

On Wed, Dec 07, 2011 at 04:05:29PM +0530, Amit Shah wrote:
> On (Wed) 07 Dec 2011 [12:28:24], Michael S. Tsirkin wrote:
> > On Wed, Dec 07, 2011 at 01:18:47AM +0530, Amit Shah wrote:
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index 697a0fc..1378f3c 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -1151,6 +1151,38 @@ static void __devexit virtnet_remove(struct virtio_device *vdev)
> > >  	free_netdev(vi->dev);
> > >  }
> > >  
> > > +#ifdef CONFIG_PM
> > > +static int virtnet_freeze(struct virtio_device *vdev)
> > > +{
> > > +	struct virtnet_info *vi = vdev->priv;
> > > +
> > > +	netif_device_detach(vi->dev);
> > > +	if (netif_running(vi->dev))
> > > +		napi_disable(&vi->napi);
> > > +
> > 
> > Could refill_work still be running at this point?
> 
> Yes, it could.  So moving the cancel_delayed_work_sync() before
> disabling napi would work fine?

No, because napi poll can schedule that.
Further, refill can reschedule itself.

It also looks like we have a bug in virtio net cleanup now:
cancel_delayed_work_sync is called after unregister, so
it will be calling napi API on an invalid device.
And, if it schedules itself it will run after device is gone.

I think we need some locking to fix this.

>  Anything else that might similar treatment?
> 
> 		Amit



-- 
MST

^ permalink raw reply

* Re: [net-next RFC PATCH 0/5] Series short description
From: Jason Wang @ 2011-12-07 11:31 UTC (permalink / raw)
  To: Rusty Russell
  Cc: krkumar2, kvm, mst, netdev, virtualization, levinsasha928,
	bhutchings
In-Reply-To: <87ty5cj0sw.fsf@rustcorp.com.au>

On 12/07/2011 03:30 PM, Rusty Russell wrote:
> On Mon, 05 Dec 2011 16:58:37 +0800, Jason Wang<jasowang@redhat.com>  wrote:
>> multiple queue virtio-net: flow steering through host/guest cooperation
>>
>> Hello all:
>>
>> This is a rough series adds the guest/host cooperation of flow
>> steering support based on Krish Kumar's multiple queue virtio-net
>> driver patch 3/3 (http://lwn.net/Articles/467283/).
> Is there a real (physical) device which does this kind of thing?  How do
> they do it?  Can we copy them?
>
> Cheers,
> Rusty.
As far as I see, ixgbe and sfc have similar but much more sophisticated 
mechanism.

The idea was originally suggested by Ben and it was just borrowed form 
those real physical nic cards who can dispatch packets based on their 
hash. All of theses cards can filter the flow based on the hash of 
L2/L3/L4 header and the stack would tell the card which queue should 
this flow goes.

So in host, a simple hash to queue table were introduced in tap/macvtap 
and in guest, the guest driver would tell the desired queue of a flow 
through changing this table.

^ permalink raw reply

* Re: [net-next RFC PATCH 5/5] virtio-net: flow director support
From: Jason Wang @ 2011-12-07 11:05 UTC (permalink / raw)
  To: Sridhar Samudrala
  Cc: krkumar2, xma, kvm, Michael S. Tsirkin, virtualization,
	levinsasha928, netdev, bhutchings
In-Reply-To: <4EDEA0E1.6020501@us.ibm.com>

On 12/07/2011 07:10 AM, Sridhar Samudrala wrote:
> On 12/6/2011 8:14 AM, Michael S. Tsirkin wrote:
>> On Tue, Dec 06, 2011 at 07:42:54AM -0800, Sridhar Samudrala wrote:
>>> On 12/6/2011 5:15 AM, Stefan Hajnoczi wrote:
>>>> On Tue, Dec 6, 2011 at 10:21 AM, Jason Wang<jasowang@redhat.com>   
>>>> wrote:
>>>>> On 12/06/2011 05:18 PM, Stefan Hajnoczi wrote:
>>>>>> On Tue, Dec 6, 2011 at 6:33 AM, Jason 
>>>>>> Wang<jasowang@redhat.com>     wrote:
>>>>>>> On 12/05/2011 06:55 PM, Stefan Hajnoczi wrote:
>>>>>>>> On Mon, Dec 5, 2011 at 8:59 AM, Jason Wang<jasowang@redhat.com>
>>>>>>>>   wrote:
>>>>>> The vcpus are just threads and may not be bound to physical CPUs, so
>>>>>> what is the big picture here?  Is the guest even in the position to
>>>>>> set the best queue mappings today?
>>>>> Not sure it could publish the best mapping but the idea is to make 
>>>>> sure the
>>>>> packets of a flow were handled by the same guest vcpu and may be 
>>>>> the same
>>>>> vhost thread in order to eliminate the packet reordering and lock
>>>>> contention. But this assumption does not take the bouncing of 
>>>>> vhost or vcpu
>>>>> threads which would also affect the result.
>>>> Okay, this is why I'd like to know what the big picture here is.  What
>>>> solution are you proposing?  How are we going to have everything from
>>>> guest application, guest kernel, host threads, and host NIC driver
>>>> play along so we get the right steering up the entire stack.  I think
>>>> there needs to be an answer to that before changing virtio-net to add
>>>> any steering mechanism.
>>>>
>>>>
>>> Yes. Also the current model of  a vhost thread per VM's interface
>>> doesn't help with packet steering
>>> all the way from the guest to the host physical NIC.
>>>
>>> I think we need to have vhost thread(s) per-CPU that can handle
>>> packets to/from physical NIC's
>>> TX/RX queues.
>>> Currently we have a single vhost thread for a VM's i/f
>>> that handles all the packets from
>>> various flows coming from a multi-queue physical NIC.
>>>
>>> Thanks
>>> Sridhar
>> It's not hard to try that:
>> 1. revert c23f3445e68e1db0e74099f264bc5ff5d55ebdeb
>>     this will convert our thread to a workqueue
>> 2. convert the workqueue to a per-cpu one
>>
>> It didn't work that well in the past, but YMMV
> Yes. I tried this before we went ahead with per-interface vhost 
> threading model.
> At that time, per-cpu vhost  showed a regression with a single-VM and
> per-vq vhost showed good performance improvements upto 8 VMs.
>
> So  just making it per-cpu would not be enough. I think we may need a way
> to schedule vcpu threads on the same cpu-socket as vhost.
>
> Another aspect we need to look into is the splitting of vhost thread 
> into separate
> threads for TX and RX. Shirley is doing some work in this area and she 
> is seeing
> perf. improvements as long as TX and RX threads are on the same 
> cpu-socket.

I emulated this through my multi-queue series in the past, looks like it 
damages the performance of single stream especially guest tx.
>>
>> On the surface I'd say a single thread makes some sense
>> as long as guest uses a single queue.
>>
> But this may not be scalable long term when we want to support a large 
> number of VMs each
> having multiple virtio-net interfaces with multiple queues.
>
> Thanks
> Sridhar
>

^ permalink raw reply

* Re: [net-next RFC PATCH 5/5] virtio-net: flow director support
From: Jason Wang @ 2011-12-07 11:02 UTC (permalink / raw)
  To: Sridhar Samudrala
  Cc: krkumar2, kvm, mst, virtualization, levinsasha928, netdev,
	bhutchings
In-Reply-To: <4EDE37FE.5090409@us.ibm.com>

On 12/06/2011 11:42 PM, Sridhar Samudrala wrote:
> On 12/6/2011 5:15 AM, Stefan Hajnoczi wrote:
>> On Tue, Dec 6, 2011 at 10:21 AM, Jason Wang<jasowang@redhat.com>  wrote:
>>> On 12/06/2011 05:18 PM, Stefan Hajnoczi wrote:
>>>> On Tue, Dec 6, 2011 at 6:33 AM, Jason Wang<jasowang@redhat.com>    
>>>> wrote:
>>>>> On 12/05/2011 06:55 PM, Stefan Hajnoczi wrote:
>>>>>> On Mon, Dec 5, 2011 at 8:59 AM, Jason Wang<jasowang@redhat.com>
>>>>>>   wrote:
>>>> The vcpus are just threads and may not be bound to physical CPUs, so
>>>> what is the big picture here?  Is the guest even in the position to
>>>> set the best queue mappings today?
>>>
>>> Not sure it could publish the best mapping but the idea is to make 
>>> sure the
>>> packets of a flow were handled by the same guest vcpu and may be the 
>>> same
>>> vhost thread in order to eliminate the packet reordering and lock
>>> contention. But this assumption does not take the bouncing of vhost 
>>> or vcpu
>>> threads which would also affect the result.
>> Okay, this is why I'd like to know what the big picture here is.  What
>> solution are you proposing?  How are we going to have everything from
>> guest application, guest kernel, host threads, and host NIC driver
>> play along so we get the right steering up the entire stack.  I think
>> there needs to be an answer to that before changing virtio-net to add
>> any steering mechanism.
>>
>>
> Yes. Also the current model of  a vhost thread per VM's interface 
> doesn't help with packet steering
> all the way from the guest to the host physical NIC.
>
> I think we need to have vhost thread(s) per-CPU that can handle 
> packets to/from physical NIC's
> TX/RX queues. Currently we have a single vhost thread for a VM's i/f 
> that handles all the packets from
> various flows coming from a multi-queue physical NIC.

Even if we have per-cpu workthread, only one socket is used to queue the 
packet then, so a multiple queue(sockets) tap/macvtap is still needed.
>
> Thanks
> Sridhar
>

^ permalink raw reply

* Re: [PATCH v4 04/12] virtio: console: Add freeze and restore handlers to support S4
From: Amit Shah @ 2011-12-07 10:59 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: linux-kernel, levinsasha928, Virtualization List
In-Reply-To: <20111207104323.GE8326@redhat.com>

On (Wed) 07 Dec 2011 [12:43:24], Michael S. Tsirkin wrote:

> > diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
> > index e14f5aa..fd2fd6f 100644
> > --- a/drivers/char/virtio_console.c
> > +++ b/drivers/char/virtio_console.c
> > @@ -1844,6 +1844,60 @@ static unsigned int features[] = {
> >  	VIRTIO_CONSOLE_F_MULTIPORT,
> >  };
> >  
> > +#ifdef CONFIG_PM
> > +static int virtcons_freeze(struct virtio_device *vdev)
> > +{
> > +	struct ports_device *portdev;
> > +	struct port *port;
> > +
> > +	portdev = vdev->priv;
> > +
> > +	vdev->config->reset(vdev);
> 
> 
> So here, cancel_work_sync might still be running.
> If it does run, might it try to access the device
> config?  Could not determine this quickly, if yes it's a problem.

Similar to the other comment: I don't see why just resetting device
can cause config queue access to go bad.

		Amit

^ permalink raw reply

* Re: [PATCH v4 06/12] virtio: blk: Add freeze, restore handlers to support S4
From: Amit Shah @ 2011-12-07 10:56 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: linux-kernel, levinsasha928, Virtualization List
In-Reply-To: <20111207103701.GD8326@redhat.com>

On (Wed) 07 Dec 2011 [12:37:02], Michael S. Tsirkin wrote:
> On Wed, Dec 07, 2011 at 01:18:44AM +0530, Amit Shah wrote:
> > Delete the vq and flush any pending requests from the block queue on the
> > freeze callback to prepare for hibernation.
> > 
> > Re-create the vq in the restore callback to resume normal function.
> > 
> > Signed-off-by: Amit Shah <amit.shah@redhat.com>
> > ---
> >  drivers/block/virtio_blk.c |   38 ++++++++++++++++++++++++++++++++++++++
> >  1 files changed, 38 insertions(+), 0 deletions(-)
> > 
> > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> > index 467f218..a9147a6 100644
> > --- a/drivers/block/virtio_blk.c
> > +++ b/drivers/block/virtio_blk.c
> > @@ -568,6 +568,40 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
> >  	ida_simple_remove(&vd_index_ida, index);
> >  }
> >  
> > +#ifdef CONFIG_PM
> > +static int virtblk_freeze(struct virtio_device *vdev)
> > +{
> > +	struct virtio_blk *vblk = vdev->priv;
> > +
> > +	/* Ensure we don't receive any more interrupts */
> > +	vdev->config->reset(vdev);
> > +
> > +	flush_work(&vblk->config_work);
> 
> It bothers me that config work can be running
> after reset here. If it does, it will not get sane
> values from reading config.

Why so?

The reset only ensures the host doesn't write anything more, isn't it?
Why would the values be affected?

> Also, can there be stuff in the reqs list?
> If yes is this a problem?

Should be all cleared by the two commands below.  At least that's my
expectation.  If not, let me know!

> > +	spin_lock_irq(vblk->disk->queue->queue_lock);
> > +	blk_stop_queue(vblk->disk->queue);
> > +	spin_unlock_irq(vblk->disk->queue->queue_lock);
> > +	blk_sync_queue(vblk->disk->queue);
> > +
> > +	vdev->config->del_vqs(vdev);
> > +	return 0;
> > +}
> > +
> 
> Thinking about it, looks like there's a bug in
> virtblk_remove: if we get a config change after
> flush_work we schedule another work.
> That's a problem for sure as structure is removed.

Yep, it is a potential issue.

		Amit

^ permalink raw reply

* Re: [Xen-devel] [PATCH RFC V3 2/4] kvm hypervisor : Add a hypercall to KVM hypervisor to support pv-ticketlocks
From: Avi Kivity @ 2011-12-07 10:53 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Raghavendra K T, KVM, Peter Zijlstra, Jeremy Fitzhardinge,
	Virtualization, Raghavendra K T, H. Peter Anvin, Dave Hansen, Xen,
	Dave Jiang, x86, Ingo Molnar, Stefano Stabellini,
	Jeremy Fitzhardinge, Sasha Levin, Sedat Dilek, Thomas Gleixner,
	Yinghai Lu, Konrad Rzeszutek Wilk, Greg Kroah-Hartman,
	Srivatsa Vaddagiri, LKML
In-Reply-To: <20111206164947.GA13184@andromeda.dapyr.net>

On 12/06/2011 06:49 PM, Konrad Rzeszutek Wilk wrote:
> On Sun, Dec 04, 2011 at 11:36:58PM +0530, Raghavendra K T wrote:
> > On 12/02/2011 01:20 AM, Raghavendra K T wrote:
> > >>Have you tested it on AMD machines? There are some differences in the
> > >>hypercall infrastructure there.
> > >
> > >Yes. 'll test on AMD machine and update on that.
> > >
> > 
> > I tested the code on 64 bit Dual-Core AMD Opteron machine, and it is 
> > working.
>
> I am not that familiar with how KVM does migration, but do you need any
> special flags in QEMU to when migrating a KVM-pv-spinlock enabled guest
> to another machine?

I don't think so.  Sleeping is an ordinary HLT state which we already
migrate.  There are no further states.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply

* Re: [PATCH v4 00/12] virtio: s4 support
From: Rusty Russell @ 2011-12-07 10:52 UTC (permalink / raw)
  To: Amit Shah
  Cc: linux-kernel, Michael S. Tsirkin, levinsasha928,
	Virtualization List
In-Reply-To: <20111207074456.GD4651@amit-x200.redhat.com>

On Wed, 7 Dec 2011 13:14:56 +0530, Amit Shah <amit.shah@redhat.com> wrote:
> On (Wed) 07 Dec 2011 [17:54:29], Rusty Russell wrote:
> > I figure there's a reason, but it seems a bit weird :)
> 
> Well, there is one reason right now: migrating storage along with
> VMs.  The guest needs to sync all data to the disk before the target
> host accesses the image file.

I don't see why.  Sure, if qemu (or whatever) is doing buffering, it
needs to flush that.  It doesn't, last I looked, it just maps to
read/write.

What am I missing?

Thanks,
Rusty.

^ permalink raw reply

* Re: [PATCH RFC V3 2/4] kvm hypervisor : Add a hypercall to KVM hypervisor to support pv-ticketlocks
From: Marcelo Tosatti @ 2011-12-07 10:48 UTC (permalink / raw)
  To: Raghavendra K T
  Cc: Peter Zijlstra, Virtualization, H. Peter Anvin,
	Stefano Stabellini, Jeremy Fitzhardinge, Dave Jiang, KVM, x86,
	Ingo Molnar, Avi Kivity, Rik van Riel, Konrad Rzeszutek Wilk,
	Srivatsa Vaddagiri, Xen, Sasha Levin, Sedat Dilek,
	Thomas Gleixner, Yinghai Lu, Greg Kroah-Hartman, LKML,
	Dave Hansen, Suzuki Poulose
In-Reply-To: <20111130085959.23386.69166.sendpatchset@oc5400248562.ibm.com>

On Wed, Nov 30, 2011 at 02:29:59PM +0530, Raghavendra K T wrote:
> Add a hypercall to KVM hypervisor to support pv-ticketlocks 
> 
> KVM_HC_KICK_CPU allows the calling vcpu to kick another vcpu out of halt state.
>     
> The presence of these hypercalls is indicated to guest via
> KVM_FEATURE_KICK_VCPU/KVM_CAP_KICK_VCPU.
> 
> Qemu needs a corresponding patch to pass up the presence of this feature to 
> guest via cpuid. Patch to qemu will be sent separately.
> 
> There is no Xen/KVM hypercall interface to await kick from.
>     
> Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
> Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com>
> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> ---
> diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
> index 734c376..8b1d65d 100644
> --- a/arch/x86/include/asm/kvm_para.h
> +++ b/arch/x86/include/asm/kvm_para.h
> @@ -16,12 +16,14 @@
>  #define KVM_FEATURE_CLOCKSOURCE		0
>  #define KVM_FEATURE_NOP_IO_DELAY	1
>  #define KVM_FEATURE_MMU_OP		2
> +
>  /* This indicates that the new set of kvmclock msrs
>   * are available. The use of 0x11 and 0x12 is deprecated
>   */
>  #define KVM_FEATURE_CLOCKSOURCE2        3
>  #define KVM_FEATURE_ASYNC_PF		4
>  #define KVM_FEATURE_STEAL_TIME		5
> +#define KVM_FEATURE_KICK_VCPU		6
>  
>  /* The last 8 bits are used to indicate how to interpret the flags field
>   * in pvclock structure. If no bits are set, all flags are ignored.
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c38efd7..6e1c8b4 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2103,6 +2103,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>  	case KVM_CAP_XSAVE:
>  	case KVM_CAP_ASYNC_PF:
>  	case KVM_CAP_GET_TSC_KHZ:
> +	case KVM_CAP_KICK_VCPU:
>  		r = 1;
>  		break;
>  	case KVM_CAP_COALESCED_MMIO:
> @@ -2577,7 +2578,8 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  			     (1 << KVM_FEATURE_NOP_IO_DELAY) |
>  			     (1 << KVM_FEATURE_CLOCKSOURCE2) |
>  			     (1 << KVM_FEATURE_ASYNC_PF) |
> -			     (1 << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT);
> +			     (1 << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT) |
> +			     (1 << KVM_FEATURE_KICK_VCPU);
>  
>  		if (sched_info_on())
>  			entry->eax |= (1 << KVM_FEATURE_STEAL_TIME);
> @@ -5305,6 +5307,26 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>  	return 1;
>  }
>  
> +/*
> + * kvm_pv_kick_cpu_op:  Kick a vcpu.
> + *
> + * @cpu - vcpu to be kicked.
> + */
> +static void kvm_pv_kick_cpu_op(struct kvm *kvm, int cpu)
> +{
> +	struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, cpu);
> +	struct kvm_mp_state mp_state;
> +
> +	mp_state.mp_state = KVM_MP_STATE_RUNNABLE;

Since vcpu->mp_state is not protected by a lock, this is potentially racy. For example:

CPU0						CPU1
kvm_pv_kick_cpu_op				running vcpuN
vcpuN->mp_state = KVM_MP_STATE_RUNNABLE;
						kvm_emulate_halt
						vcpuN->mp_state = KVM_MP_STATE_HALTED

Is it harmless to lose a kick?

^ permalink raw reply

* Re: [PATCH v4 00/12] virtio: s4 support
From: Gleb Natapov @ 2011-12-07 10:44 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Amit Shah, linux-kernel, Michael S. Tsirkin, levinsasha928,
	Virtualization List
In-Reply-To: <87wra8j13m.fsf@rustcorp.com.au>

On Wed, Dec 07, 2011 at 05:54:29PM +1030, Rusty Russell wrote:
> On Wed,  7 Dec 2011 01:18:38 +0530, Amit Shah <amit.shah@redhat.com> wrote:
> > Hi,
> > 
> > These patches add support for S4 to virtio (pci) and all drivers.
> 
> Dumb meta-question: why do we want to hibernate virtual machines?
> 
For instance you want to hibernate your laptop while it is running a virtual
machine. You can pause them of course, but then time will be incorrect
inside a guest after resume.

> I figure there's a reason, but it seems a bit weird :)
> 
> Thanks,
> Rusty.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

--
			Gleb.

^ permalink raw reply

* Re: [PATCH v4 04/12] virtio: console: Add freeze and restore handlers to support S4
From: Michael S. Tsirkin @ 2011-12-07 10:43 UTC (permalink / raw)
  To: Amit Shah; +Cc: linux-kernel, levinsasha928, Virtualization List
In-Reply-To: <9f09629340556ddac37135bc3aab233f1154f86a.1323199985.git.amit.shah@redhat.com>

On Wed, Dec 07, 2011 at 01:18:42AM +0530, Amit Shah wrote:
> Remove all vqs and associated buffers in the freeze callback which
> prepares us to go into hibernation state.  On restore, re-create all the
> vqs and populate the input vqs with buffers to get to the pre-hibernate
> state.
> 
> Note: Any outstanding unconsumed buffers are discarded; which means
> there's a possibility of data loss in case the host or the guest didn't
> consume any data already present in the vqs.  This can be addressed in a
> later patch series, perhaps in virtio common code.
> 
> Signed-off-by: Amit Shah <amit.shah@redhat.com>
> ---
>  drivers/char/virtio_console.c |   58 +++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 58 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
> index e14f5aa..fd2fd6f 100644
> --- a/drivers/char/virtio_console.c
> +++ b/drivers/char/virtio_console.c
> @@ -1844,6 +1844,60 @@ static unsigned int features[] = {
>  	VIRTIO_CONSOLE_F_MULTIPORT,
>  };
>  
> +#ifdef CONFIG_PM
> +static int virtcons_freeze(struct virtio_device *vdev)
> +{
> +	struct ports_device *portdev;
> +	struct port *port;
> +
> +	portdev = vdev->priv;
> +
> +	vdev->config->reset(vdev);


So here, cancel_work_sync might still be running.
If it does run, might it try to access the device
config?  Could not determine this quickly, if yes it's a problem.

> +
> +	cancel_work_sync(&portdev->control_work);
> +	remove_controlq_data(portdev);
> +
> +	list_for_each_entry(port, &portdev->ports, list) {
> +		/*
> +		 * We'll ask the host later if the new invocation has
> +		 * the port opened or closed.
> +		 */
> +		port->host_connected = false;
> +		remove_port_data(port);
> +	}
> +	remove_vqs(portdev);
> +
> +	return 0;
> +}
> +
> +static int virtcons_restore(struct virtio_device *vdev)
> +{
> +	struct ports_device *portdev;
> +	struct port *port;
> +	int ret;
> +
> +	portdev = vdev->priv;
> +
> +	ret = init_vqs(portdev);
> +	if (ret)
> +		return ret;
> +
> +	if (use_multiport(portdev))
> +		fill_queue(portdev->c_ivq, &portdev->cvq_lock);
> +
> +	list_for_each_entry(port, &portdev->ports, list) {
> +		port->in_vq = portdev->in_vqs[port->id];
> +		port->out_vq = portdev->out_vqs[port->id];
> +
> +		fill_queue(port->in_vq, &port->inbuf_lock);
> +
> +		/* Get port open/close status on the host */
> +		send_control_msg(port, VIRTIO_CONSOLE_PORT_READY, 1);
> +	}
> +	return 0;
> +}
> +#endif
> +
>  static struct virtio_driver virtio_console = {
>  	.feature_table = features,
>  	.feature_table_size = ARRAY_SIZE(features),
> @@ -1853,6 +1907,10 @@ static struct virtio_driver virtio_console = {
>  	.probe =	virtcons_probe,
>  	.remove =	virtcons_remove,
>  	.config_changed = config_intr,
> +#ifdef CONFIG_PM
> +	.freeze =	virtcons_freeze,
> +	.restore =	virtcons_restore,
> +#endif
>  };
>  
>  static int __init init(void)
> -- 
> 1.7.7.3

^ permalink raw reply

* Re: (resend) [PATCH v4 12/12] virtio: balloon: Add freeze, restore handlers to support S4
From: Amit Shah @ 2011-12-07 10:39 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Len Brown, linux-pm, linux-kernel, Virtualization List,
	Rafael J. Wysocki, levinsasha928, Pavel Machek
In-Reply-To: <20111207103448.GA20882@redhat.com>

On (Wed) 07 Dec 2011 [12:34:48], Michael S. Tsirkin wrote:
> On Wed, Dec 07, 2011 at 01:18:50AM +0530, Amit Shah wrote:
> > Now to not race with a host issuing ballooning requests while we are in
> > the process of freezing, we just exit from the vballoon kthread when the
> > processes are asked to freeze.  Upon thaw and restore, we re-start the
> > thread.
> 
> ...
> 
> > ---
> >  drivers/virtio/virtio_balloon.c |   79 ++++++++++++++++++++++++++++++++++++++-
> >  1 files changed, 78 insertions(+), 1 deletions(-)
> > 
> > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> > index 8bf99be..10ec638 100644
> > --- a/drivers/virtio/virtio_balloon.c
> > +++ b/drivers/virtio/virtio_balloon.c
> > @@ -258,7 +258,13 @@ static int balloon(void *_vballoon)
> >  	while (!kthread_should_stop()) {
> >  		s64 diff;
> >  
> > -		try_to_freeze();
> > +		/*
> > +		 * On suspend, we want to exit this thread.  We will
> > +		 * start a new thread on resume.
> > +		 */
> > +		if (freezing(current))
> > +			break;
> > +
> >  		wait_event_interruptible(vb->config_change,
> >  					 (diff = towards_target(vb)) != 0
> >  					 || vb->need_stats_update
> 
> ...
> 
> Note: this relies on kthreads being frozen before devices.
> Looking at kernel/power/hibernate.c this is the case,
> but I think we should add a comment to note this.

Yes, it does.  And for that reason, I mentioned that stopping the
thread doesn't buy us anything; I'll revert this change in the next
submission.

> Also Cc linux-pm crowd in case I got it wrong.

		Amit

^ permalink raw reply

* Re: [PATCH v4 06/12] virtio: blk: Add freeze, restore handlers to support S4
From: Michael S. Tsirkin @ 2011-12-07 10:37 UTC (permalink / raw)
  To: Amit Shah; +Cc: linux-kernel, levinsasha928, Virtualization List
In-Reply-To: <e66570ecae66535fbb5c060f6572b94e8c9db85e.1323199985.git.amit.shah@redhat.com>

On Wed, Dec 07, 2011 at 01:18:44AM +0530, Amit Shah wrote:
> Delete the vq and flush any pending requests from the block queue on the
> freeze callback to prepare for hibernation.
> 
> Re-create the vq in the restore callback to resume normal function.
> 
> Signed-off-by: Amit Shah <amit.shah@redhat.com>
> ---
>  drivers/block/virtio_blk.c |   38 ++++++++++++++++++++++++++++++++++++++
>  1 files changed, 38 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 467f218..a9147a6 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -568,6 +568,40 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
>  	ida_simple_remove(&vd_index_ida, index);
>  }
>  
> +#ifdef CONFIG_PM
> +static int virtblk_freeze(struct virtio_device *vdev)
> +{
> +	struct virtio_blk *vblk = vdev->priv;
> +
> +	/* Ensure we don't receive any more interrupts */
> +	vdev->config->reset(vdev);
> +
> +	flush_work(&vblk->config_work);

It bothers me that config work can be running
after reset here. If it does, it will not get sane
values from reading config.


Also, can there be stuff in the reqs list?
If yes is this a problem?

> +
> +	spin_lock_irq(vblk->disk->queue->queue_lock);
> +	blk_stop_queue(vblk->disk->queue);
> +	spin_unlock_irq(vblk->disk->queue->queue_lock);
> +	blk_sync_queue(vblk->disk->queue);
> +
> +	vdev->config->del_vqs(vdev);
> +	return 0;
> +}
> +

Thinking about it, looks like there's a bug in
virtblk_remove: if we get a config change after
flush_work we schedule another work.
That's a problem for sure as structure is removed.


> +static int virtblk_restore(struct virtio_device *vdev)
> +{
> +	struct virtio_blk *vblk = vdev->priv;
> +	int ret;
> +
> +	ret = init_vq(vdev->priv);
> +	if (!ret) {
> +		spin_lock_irq(vblk->disk->queue->queue_lock);
> +		blk_start_queue(vblk->disk->queue);
> +		spin_unlock_irq(vblk->disk->queue->queue_lock);
> +	}
> +	return ret;
> +}
> +#endif
> +
>  static const struct virtio_device_id id_table[] = {
>  	{ VIRTIO_ID_BLOCK, VIRTIO_DEV_ANY_ID },
>  	{ 0 },
> @@ -593,6 +627,10 @@ static struct virtio_driver __refdata virtio_blk = {
>  	.probe			= virtblk_probe,
>  	.remove			= __devexit_p(virtblk_remove),
>  	.config_changed		= virtblk_config_changed,
> +#ifdef CONFIG_PM
> +	.freeze			= virtblk_freeze,
> +	.restore		= virtblk_restore,
> +#endif
>  };
>  
>  static int __init init(void)
> -- 
> 1.7.7.3

^ permalink raw reply

* Re: [PATCH v4 09/12] virtio: net: Add freeze, restore handlers to support S4
From: Amit Shah @ 2011-12-07 10:35 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: linux-kernel, levinsasha928, Virtualization List
In-Reply-To: <20111207102824.GC8326@redhat.com>

On (Wed) 07 Dec 2011 [12:28:24], Michael S. Tsirkin wrote:
> On Wed, Dec 07, 2011 at 01:18:47AM +0530, Amit Shah wrote:
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 697a0fc..1378f3c 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -1151,6 +1151,38 @@ static void __devexit virtnet_remove(struct virtio_device *vdev)
> >  	free_netdev(vi->dev);
> >  }
> >  
> > +#ifdef CONFIG_PM
> > +static int virtnet_freeze(struct virtio_device *vdev)
> > +{
> > +	struct virtnet_info *vi = vdev->priv;
> > +
> > +	netif_device_detach(vi->dev);
> > +	if (netif_running(vi->dev))
> > +		napi_disable(&vi->napi);
> > +
> 
> Could refill_work still be running at this point?

Yes, it could.  So moving the cancel_delayed_work_sync() before
disabling napi would work fine?  Anything else that might similar treatment?

		Amit

^ permalink raw reply

* Re: (resend) [PATCH v4 12/12] virtio: balloon: Add freeze, restore handlers to support S4
From: Michael S. Tsirkin @ 2011-12-07 10:34 UTC (permalink / raw)
  To: Amit Shah
  Cc: Len Brown, linux-pm, linux-kernel, Virtualization List,
	Rafael J. Wysocki, levinsasha928, Pavel Machek
In-Reply-To: <5deccc36afa59032f0e3b10a653773bad511f303.1323199985.git.amit.shah@redhat.com>

On Wed, Dec 07, 2011 at 01:18:50AM +0530, Amit Shah wrote:
> Now to not race with a host issuing ballooning requests while we are in
> the process of freezing, we just exit from the vballoon kthread when the
> processes are asked to freeze.  Upon thaw and restore, we re-start the
> thread.

...

> ---
>  drivers/virtio/virtio_balloon.c |   79 ++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 78 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 8bf99be..10ec638 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -258,7 +258,13 @@ static int balloon(void *_vballoon)
>  	while (!kthread_should_stop()) {
>  		s64 diff;
>  
> -		try_to_freeze();
> +		/*
> +		 * On suspend, we want to exit this thread.  We will
> +		 * start a new thread on resume.
> +		 */
> +		if (freezing(current))
> +			break;
> +
>  		wait_event_interruptible(vb->config_change,
>  					 (diff = towards_target(vb)) != 0
>  					 || vb->need_stats_update

...

Note: this relies on kthreads being frozen before devices.
Looking at kernel/power/hibernate.c this is the case,
but I think we should add a comment to note this.

Also Cc linux-pm crowd in case I got it wrong.


---

Resending due to corrupted headers. Sorry about the noise.

-- 
MST

^ permalink raw reply

* Re: [PATCH v4 09/12] virtio: net: Add freeze, restore handlers to support S4
From: Michael S. Tsirkin @ 2011-12-07 10:28 UTC (permalink / raw)
  To: Amit Shah; +Cc: linux-kernel, levinsasha928, Virtualization List
In-Reply-To: <712d8d78cf730161181562ed44d4fe08a370b85e.1323199985.git.amit.shah@redhat.com>

On Wed, Dec 07, 2011 at 01:18:47AM +0530, Amit Shah wrote:
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 697a0fc..1378f3c 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1151,6 +1151,38 @@ static void __devexit virtnet_remove(struct virtio_device *vdev)
>  	free_netdev(vi->dev);
>  }
>  
> +#ifdef CONFIG_PM
> +static int virtnet_freeze(struct virtio_device *vdev)
> +{
> +	struct virtnet_info *vi = vdev->priv;
> +
> +	netif_device_detach(vi->dev);
> +	if (netif_running(vi->dev))
> +		napi_disable(&vi->napi);
> +

Could refill_work still be running at this point?
If yes it can re-enable napi and cause other kind of trouble.

> +	remove_vq_common(vi);
> +
> +	return 0;
> +}

^ permalink raw reply

* (unknown)
From: Michael S. Tsirkin @ 2011-12-07 10:25 UTC (permalink / raw)
  To: Amit Shah; +Cc: linux-kernel, levinsasha928, Virtualization List

Pavel Machek <pavel@ucw.cz>,
"Rafael J. Wysocki" <rjw@sisk.pl>,
Len Brown <len.brown@intel.com>,
linux-pm@vger.kernel.org
Bcc: 
Subject: Re: [PATCH v4 12/12] virtio: balloon: Add freeze, restore handlers
 to support S4
Reply-To: 
In-Reply-To: <5deccc36afa59032f0e3b10a653773bad511f303.1323199985.git.amit.shah@redhat.com>

On Wed, Dec 07, 2011 at 01:18:50AM +0530, Amit Shah wrote:
> Now to not race with a host issuing ballooning requests while we are in
> the process of freezing, we just exit from the vballoon kthread when the
> processes are asked to freeze.  Upon thaw and restore, we re-start the
> thread.

...

> ---
>  drivers/virtio/virtio_balloon.c |   79 ++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 78 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 8bf99be..10ec638 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -258,7 +258,13 @@ static int balloon(void *_vballoon)
>  	while (!kthread_should_stop()) {
>  		s64 diff;
>  
> -		try_to_freeze();
> +		/*
> +		 * On suspend, we want to exit this thread.  We will
> +		 * start a new thread on resume.
> +		 */
> +		if (freezing(current))
> +			break;
> +
>  		wait_event_interruptible(vb->config_change,
>  					 (diff = towards_target(vb)) != 0
>  					 || vb->need_stats_update

...

Note: this relies on kthreads being frozen before devices.
Looking at kernel/power/hibernate.c this is the case,
but I think we should add a comment to note this.

Also Cc linux-pm crowd in case I got it wrong.

-- 
MST

^ permalink raw reply

* Re: [PATCH v4 01/12] virtio: pci: switch to new PM API
From: Rafael J. Wysocki @ 2011-12-07 10:16 UTC (permalink / raw)
  To: Amit Shah
  Cc: linux-kernel, Michael S. Tsirkin, levinsasha928,
	Virtualization List
In-Reply-To: <20111207095229.GF4651@amit-x200.redhat.com>

On Wednesday, December 07, 2011, Amit Shah wrote:
> On (Wed) 07 Dec 2011 [10:48:24], Rafael J. Wysocki wrote:
> > On Wednesday, December 07, 2011, Amit Shah wrote:
> > > Hi Rafael,
> > > 
> > > On (Tue) 06 Dec 2011 [23:12:36], Rafael J. Wysocki wrote:
> > > > Hi,
> > > > 
> > > > On Tuesday, December 06, 2011, Amit Shah wrote:
> > > > > The older PM API doesn't have a way to get notifications on hibernate
> > > > > events.  Switch to the newer one that gives us those notifications.
> > > > > 
> > > > > Signed-off-by: Amit Shah <amit.shah@redhat.com>
> > > > > ---
> > > > >  drivers/virtio/virtio_pci.c |   16 ++++++++++++----
> > > > >  1 files changed, 12 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
> > > > > index 03d1984..23e1532 100644
> > > > > --- a/drivers/virtio/virtio_pci.c
> > > > > +++ b/drivers/virtio/virtio_pci.c
> > > > > @@ -708,19 +708,28 @@ static void __devexit virtio_pci_remove(struct pci_dev *pci_dev)
> > > > >  }
> > > > >  
> > > > >  #ifdef CONFIG_PM
> > > > > -static int virtio_pci_suspend(struct pci_dev *pci_dev, pm_message_t state)
> > > > > +static int virtio_pci_suspend(struct device *dev)
> > > > >  {
> > > > > +	struct pci_dev *pci_dev = to_pci_dev(dev);
> > > > > +
> > > > >  	pci_save_state(pci_dev);
> > > > >  	pci_set_power_state(pci_dev, PCI_D3hot);
> > > > >  	return 0;
> > > > >  }
> > > > >  
> > > > > -static int virtio_pci_resume(struct pci_dev *pci_dev)
> > > > > +static int virtio_pci_resume(struct device *dev)
> > > > >  {
> > > > > +	struct pci_dev *pci_dev = to_pci_dev(dev);
> > > > > +
> > > > >  	pci_restore_state(pci_dev);
> > > > >  	pci_set_power_state(pci_dev, PCI_D0);
> > > > >  	return 0;
> > > > >  }
> > > > > +
> > > > > +static const struct dev_pm_ops virtio_pci_pm_ops = {
> > > > > +	.suspend = virtio_pci_suspend,
> > > > > +	.resume  = virtio_pci_resume,
> > > > > +};
> > > > >  #endif
> > > > 
> > > > You seem to have forgotten about hibernation callbacks.
> > > 
> > > This patch just moves to the new API keeping everything else the same.
> > > The hibernation callbacks come in patch 2.
> > 
> > OK, but then hibernation will be broken between the two patches, so perhaps
> > it's better to merge them into one?
> 
> The hibernation support doesn't exist now.  It's added in the later
> patches of the series (3 onwards).

OK then, sorry for the noise.

Rafael

^ permalink raw reply

* Re: [PATCH v4 01/12] virtio: pci: switch to new PM API
From: Amit Shah @ 2011-12-07  9:52 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-kernel, Michael S. Tsirkin, levinsasha928,
	Virtualization List
In-Reply-To: <201112071048.24566.rjw@sisk.pl>

On (Wed) 07 Dec 2011 [10:48:24], Rafael J. Wysocki wrote:
> On Wednesday, December 07, 2011, Amit Shah wrote:
> > Hi Rafael,
> > 
> > On (Tue) 06 Dec 2011 [23:12:36], Rafael J. Wysocki wrote:
> > > Hi,
> > > 
> > > On Tuesday, December 06, 2011, Amit Shah wrote:
> > > > The older PM API doesn't have a way to get notifications on hibernate
> > > > events.  Switch to the newer one that gives us those notifications.
> > > > 
> > > > Signed-off-by: Amit Shah <amit.shah@redhat.com>
> > > > ---
> > > >  drivers/virtio/virtio_pci.c |   16 ++++++++++++----
> > > >  1 files changed, 12 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
> > > > index 03d1984..23e1532 100644
> > > > --- a/drivers/virtio/virtio_pci.c
> > > > +++ b/drivers/virtio/virtio_pci.c
> > > > @@ -708,19 +708,28 @@ static void __devexit virtio_pci_remove(struct pci_dev *pci_dev)
> > > >  }
> > > >  
> > > >  #ifdef CONFIG_PM
> > > > -static int virtio_pci_suspend(struct pci_dev *pci_dev, pm_message_t state)
> > > > +static int virtio_pci_suspend(struct device *dev)
> > > >  {
> > > > +	struct pci_dev *pci_dev = to_pci_dev(dev);
> > > > +
> > > >  	pci_save_state(pci_dev);
> > > >  	pci_set_power_state(pci_dev, PCI_D3hot);
> > > >  	return 0;
> > > >  }
> > > >  
> > > > -static int virtio_pci_resume(struct pci_dev *pci_dev)
> > > > +static int virtio_pci_resume(struct device *dev)
> > > >  {
> > > > +	struct pci_dev *pci_dev = to_pci_dev(dev);
> > > > +
> > > >  	pci_restore_state(pci_dev);
> > > >  	pci_set_power_state(pci_dev, PCI_D0);
> > > >  	return 0;
> > > >  }
> > > > +
> > > > +static const struct dev_pm_ops virtio_pci_pm_ops = {
> > > > +	.suspend = virtio_pci_suspend,
> > > > +	.resume  = virtio_pci_resume,
> > > > +};
> > > >  #endif
> > > 
> > > You seem to have forgotten about hibernation callbacks.
> > 
> > This patch just moves to the new API keeping everything else the same.
> > The hibernation callbacks come in patch 2.
> 
> OK, but then hibernation will be broken between the two patches, so perhaps
> it's better to merge them into one?

The hibernation support doesn't exist now.  It's added in the later
patches of the series (3 onwards).

		Amit

^ permalink raw reply

* Re: [PATCH v4 01/12] virtio: pci: switch to new PM API
From: Rafael J. Wysocki @ 2011-12-07  9:48 UTC (permalink / raw)
  To: Amit Shah
  Cc: linux-kernel, Michael S. Tsirkin, levinsasha928,
	Virtualization List
In-Reply-To: <20111207035703.GA4651@amit-x200.redhat.com>

On Wednesday, December 07, 2011, Amit Shah wrote:
> Hi Rafael,
> 
> On (Tue) 06 Dec 2011 [23:12:36], Rafael J. Wysocki wrote:
> > Hi,
> > 
> > On Tuesday, December 06, 2011, Amit Shah wrote:
> > > The older PM API doesn't have a way to get notifications on hibernate
> > > events.  Switch to the newer one that gives us those notifications.
> > > 
> > > Signed-off-by: Amit Shah <amit.shah@redhat.com>
> > > ---
> > >  drivers/virtio/virtio_pci.c |   16 ++++++++++++----
> > >  1 files changed, 12 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
> > > index 03d1984..23e1532 100644
> > > --- a/drivers/virtio/virtio_pci.c
> > > +++ b/drivers/virtio/virtio_pci.c
> > > @@ -708,19 +708,28 @@ static void __devexit virtio_pci_remove(struct pci_dev *pci_dev)
> > >  }
> > >  
> > >  #ifdef CONFIG_PM
> > > -static int virtio_pci_suspend(struct pci_dev *pci_dev, pm_message_t state)
> > > +static int virtio_pci_suspend(struct device *dev)
> > >  {
> > > +	struct pci_dev *pci_dev = to_pci_dev(dev);
> > > +
> > >  	pci_save_state(pci_dev);
> > >  	pci_set_power_state(pci_dev, PCI_D3hot);
> > >  	return 0;
> > >  }
> > >  
> > > -static int virtio_pci_resume(struct pci_dev *pci_dev)
> > > +static int virtio_pci_resume(struct device *dev)
> > >  {
> > > +	struct pci_dev *pci_dev = to_pci_dev(dev);
> > > +
> > >  	pci_restore_state(pci_dev);
> > >  	pci_set_power_state(pci_dev, PCI_D0);
> > >  	return 0;
> > >  }
> > > +
> > > +static const struct dev_pm_ops virtio_pci_pm_ops = {
> > > +	.suspend = virtio_pci_suspend,
> > > +	.resume  = virtio_pci_resume,
> > > +};
> > >  #endif
> > 
> > You seem to have forgotten about hibernation callbacks.
> 
> This patch just moves to the new API keeping everything else the same.
> The hibernation callbacks come in patch 2.

OK, but then hibernation will be broken between the two patches, so perhaps
it's better to merge them into one?

> >  Please use
> > one the macros defined in include/linux/pm.h if you want to use the same
> > callback routines for hibernation.
> 
> No, they're different functions, so I don't use the maros.

OK

Thanks,
Rafael

^ permalink raw reply

* Re: [net-next RFC PATCH 5/5] virtio-net: flow director support
From: Stefan Hajnoczi @ 2011-12-07  9:08 UTC (permalink / raw)
  To: Jason Wang
  Cc: krkumar2, kvm, mst, netdev, virtualization, levinsasha928,
	bhutchings
In-Reply-To: <4EDED785.80804@redhat.com>

On Wed, Dec 7, 2011 at 3:03 AM, Jason Wang <jasowang@redhat.com> wrote:
> On 12/06/2011 09:15 PM, Stefan Hajnoczi wrote:
>>
>> On Tue, Dec 6, 2011 at 10:21 AM, Jason Wang<jasowang@redhat.com>  wrote:
>>>
>>> On 12/06/2011 05:18 PM, Stefan Hajnoczi wrote:
>>>>
>>>> On Tue, Dec 6, 2011 at 6:33 AM, Jason Wang<jasowang@redhat.com>
>>>>  wrote:
>>>>>
>>>>> On 12/05/2011 06:55 PM, Stefan Hajnoczi wrote:
>>>>>>
>>>>>> On Mon, Dec 5, 2011 at 8:59 AM, Jason Wang<jasowang@redhat.com>
>>>>>>  wrote:
>>>>
>>>> The vcpus are just threads and may not be bound to physical CPUs, so
>>>> what is the big picture here?  Is the guest even in the position to
>>>> set the best queue mappings today?
>>>
>>>
>>> Not sure it could publish the best mapping but the idea is to make sure
>>> the
>>> packets of a flow were handled by the same guest vcpu and may be the same
>>> vhost thread in order to eliminate the packet reordering and lock
>>> contention. But this assumption does not take the bouncing of vhost or
>>> vcpu
>>> threads which would also affect the result.
>>
>> Okay, this is why I'd like to know what the big picture here is.  What
>> solution are you proposing?  How are we going to have everything from
>> guest application, guest kernel, host threads, and host NIC driver
>> play along so we get the right steering up the entire stack.  I think
>> there needs to be an answer to that before changing virtio-net to add
>> any steering mechanism.
>
>
> Consider the complexity of the host nic each with their own steering
> features,  this series make the first step with minimal effort to try to let
> guest driver and host tap/macvtap co-operate like what physical nic does.
> There may be other method, but performance numbers is also needed to give
> the answer.

I agree that performance results for this need to be shown.

My original point is really that it's not a good idea to take
individual steps without a good big picture because this will change
the virtio-net device specification.  If this turns out to be a dead
end then hosts will need to continue to support the interface forever
(legacy guests could still try to use it).  So please first explain
what the full stack picture is going to look like and how you think it
will lead to better performance.  You don't need to have all the code
or evidence, but just enough explanation so we see where this is all
going.

Stefan

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox