Linux virtualization list
* [PATCH 3/5] virtio: rng: don't wait on host when module is going away
From: Amit Shah @ 2012-05-28  6:48 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Amit Shah, Virtualization List
In-Reply-To: <cover.1338187342.git.amit.shah@redhat.com>

No use waiting for input from host when the module is being removed.
We're going to remove the vq in the next step anyway, so just perform
any other steps for cleanup (currently none).

Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
 drivers/char/hw_random/virtio-rng.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
index c8a9350..2dc9ce1 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -109,6 +109,7 @@ static int virtrng_probe(struct virtio_device *vdev)
 static void __devexit virtrng_remove(struct virtio_device *vdev)
 {
 	vdev->config->reset(vdev);
+	busy = false;
 	hwrng_unregister(&virtio_hwrng);
 	vdev->config->del_vqs(vdev);
 }
-- 
1.7.7.6

^ permalink raw reply related
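
The effect of clearing the flag in the remove path can be modelled outside the kernel. What follows is a userspace sketch, not the driver code: `struct rng_model`, `rng_model_read()` and `rng_model_remove()` are hypothetical names, and the sketch assumes the read path queues a new buffer only when `busy` is false and otherwise waits on the buffer already outstanding.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the driver state; not kernel code. */
struct rng_model {
	bool busy;            /* a buffer is queued and the host has not answered */
	int  buffers_queued;  /* buffers currently handed to the host */
};

/* What a read attempt does in the model. */
enum read_action { READ_QUEUED_NEW, READ_WAITS_ON_OLD };

/* Assumed read path: queue a buffer only if none is outstanding. */
static enum read_action rng_model_read(struct rng_model *m)
{
	if (!m->busy) {
		m->busy = true;
		m->buffers_queued++;
		return READ_QUEUED_NEW;
	}
	return READ_WAITS_ON_OLD;
}

/* Mirrors the order in the patch: reset the device, then clear busy. */
static void rng_model_remove(struct rng_model *m)
{
	m->buffers_queued = 0;  /* after reset nothing is outstanding */
	m->busy = false;        /* so there is nothing left to wait for */
}
```

In this model, a read issued after `rng_model_remove()` starts from a clean state instead of waiting on a buffer the host will never fill.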

* [PATCH 4/5] virtio: rng: split out common code in probe / remove for s3/s4 ops
From: Amit Shah @ 2012-05-28  6:48 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Amit Shah, Virtualization List
In-Reply-To: <cover.1338187342.git.amit.shah@redhat.com>

The freeze/restore s3/s4 operations will use code that's common to the
probe and remove routines.  Put the common code in separate functions.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
 drivers/char/hw_random/virtio-rng.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
index 2dc9ce1..a9673a7 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -88,7 +88,7 @@ static struct hwrng virtio_hwrng = {
 	.read		= virtio_read,
 };
 
-static int virtrng_probe(struct virtio_device *vdev)
+static int probe_common(struct virtio_device *vdev)
 {
 	int err;
 
@@ -106,7 +106,7 @@ static int virtrng_probe(struct virtio_device *vdev)
 	return 0;
 }
 
-static void __devexit virtrng_remove(struct virtio_device *vdev)
+static void remove_common(struct virtio_device *vdev)
 {
 	vdev->config->reset(vdev);
 	busy = false;
@@ -114,6 +114,16 @@ static void __devexit virtrng_remove(struct virtio_device *vdev)
 	vdev->config->del_vqs(vdev);
 }
 
+static int virtrng_probe(struct virtio_device *vdev)
+{
+	return probe_common(vdev);
+}
+
+static void __devexit virtrng_remove(struct virtio_device *vdev)
+{
+	remove_common(vdev);
+}
+
 static struct virtio_device_id id_table[] = {
 	{ VIRTIO_ID_RNG, VIRTIO_DEV_ANY_ID },
 	{ 0 },
-- 
1.7.7.6

^ permalink raw reply related

* [PATCH 5/5] virtio: rng: s3/s4 support
From: Amit Shah @ 2012-05-28  6:48 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Amit Shah, Virtualization List
In-Reply-To: <cover.1338187342.git.amit.shah@redhat.com>

Unregister from the hwrng interface and remove the vq before entering
the S3 or S4 states.  Add the vq and re-register with hwrng on restore.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
 drivers/char/hw_random/virtio-rng.c |   17 +++++++++++++++++
 1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
index a9673a7..5708299 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -124,6 +124,19 @@ static void __devexit virtrng_remove(struct virtio_device *vdev)
 	remove_common(vdev);
 }
 
+#ifdef CONFIG_PM
+static int virtrng_freeze(struct virtio_device *vdev)
+{
+	remove_common(vdev);
+	return 0;
+}
+
+static int virtrng_restore(struct virtio_device *vdev)
+{
+	return probe_common(vdev);
+}
+#endif
+
 static struct virtio_device_id id_table[] = {
 	{ VIRTIO_ID_RNG, VIRTIO_DEV_ANY_ID },
 	{ 0 },
@@ -135,6 +148,10 @@ static struct virtio_driver virtio_rng_driver = {
 	.id_table =	id_table,
 	.probe =	virtrng_probe,
 	.remove =	__devexit_p(virtrng_remove),
+#ifdef CONFIG_PM
+	.freeze =	virtrng_freeze,
+	.restore =	virtrng_restore,
+#endif
 };
 
 static int __init init(void)
-- 
1.7.7.6

^ permalink raw reply related
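
The shape of patches 4/5 and 5/5 taken together (two common helpers, reused by probe/remove and by freeze/restore) can be sketched in plain C. The types below are hypothetical stand-ins for `struct virtio_device` and `struct virtio_driver`, and the two integer fields merely record what the real helpers would set up or tear down.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real virtio API. */
struct model_dev {
	int vq_present;        /* the virtqueue exists */
	int hwrng_registered;  /* registered with the hwrng core */
};

struct model_driver {
	int  (*probe)(struct model_dev *);
	void (*remove)(struct model_dev *);
	int  (*freeze)(struct model_dev *);   /* set only under CONFIG_PM in the driver */
	int  (*restore)(struct model_dev *);  /* set only under CONFIG_PM in the driver */
};

static int probe_common(struct model_dev *d)
{
	d->vq_present = 1;        /* set up the vq */
	d->hwrng_registered = 1;  /* register with hwrng */
	return 0;
}

static void remove_common(struct model_dev *d)
{
	d->hwrng_registered = 0;  /* unregister from hwrng */
	d->vq_present = 0;        /* delete the vq */
}

/* Thin wrappers, as in the patches: each entry point reuses a common helper. */
static int  rng_pm_probe(struct model_dev *d)   { return probe_common(d); }
static void rng_pm_remove(struct model_dev *d)  { remove_common(d); }
static int  rng_pm_freeze(struct model_dev *d)  { remove_common(d); return 0; }
static int  rng_pm_restore(struct model_dev *d) { return probe_common(d); }

static const struct model_driver model_rng_driver = {
	.probe   = rng_pm_probe,
	.remove  = rng_pm_remove,
	.freeze  = rng_pm_freeze,
	.restore = rng_pm_restore,
};
```

The point of the split is that a freeze followed by a restore leaves the device in the same state as a fresh probe, without duplicating the setup code.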

* Re: [PATCH 0/5] virtio: rng: fixes
From: Rusty Russell @ 2012-05-28  8:24 UTC (permalink / raw)
  Cc: Amit Shah, Virtualization List
In-Reply-To: <cover.1338187342.git.amit.shah@redhat.com>

On Mon, 28 May 2012 12:18:38 +0530, Amit Shah <amit.shah@redhat.com> wrote:
> Hi Rusty,
> 
> These are a few fixes for the virtio-rng driver.  These were tested
> using the not-yet-upstream virtio-rng device patch to qemu:
> 
> http://thread.gmane.org/gmane.comp.emulators.qemu/152668
> 
> Please apply.

Thanks, applied.

Cheers,
Rusty.

^ permalink raw reply

* Re: [PATCH RFC] virtio-net: remove useless disable on freeze
From: Michael S. Tsirkin @ 2012-05-28 12:53 UTC (permalink / raw)
  To: netdev; +Cc: Amit Shah, linux-kernel, kvm, virtualization
In-Reply-To: <20120404091954.GA3776@redhat.com>

On Wed, Apr 04, 2012 at 12:19:54PM +0300, Michael S. Tsirkin wrote:
> disable_cb is just an optimization: it
> can not guarantee that there are no callbacks.
> 
> I didn't yet figure out whether a callback
> in freeze will trigger a bug, but disable_cb
> won't address it in any case. So let's remove
> the useless calls as a first step.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Looks like this isn't in the 3.5 pull request -
just lost in the shuffle?
disable_cb is advisory so can't be relied upon.

> ---
>  drivers/net/virtio_net.c |    5 -----
>  1 files changed, 0 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 019da01..971931e5 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1182,11 +1182,6 @@ static int virtnet_freeze(struct virtio_device *vdev)
>  {
>  	struct virtnet_info *vi = vdev->priv;
>  
> -	virtqueue_disable_cb(vi->rvq);
> -	virtqueue_disable_cb(vi->svq);
> -	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ))
> -		virtqueue_disable_cb(vi->cvq);
> -
>  	netif_device_detach(vi->dev);
>  	cancel_delayed_work_sync(&vi->refill);
>  
> -- 
> 1.7.9.111.gf3fb0

^ permalink raw reply

* Re: [PATCH RFC] virtio-net: remove useless disable on freeze
From: Rusty Russell @ 2012-05-30 10:11 UTC (permalink / raw)
  To: Michael S. Tsirkin, netdev; +Cc: Amit Shah, linux-kernel, kvm, virtualization
In-Reply-To: <20120528125325.GA22576@redhat.com>

On Mon, 28 May 2012 15:53:25 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Wed, Apr 04, 2012 at 12:19:54PM +0300, Michael S. Tsirkin wrote:
> > disable_cb is just an optimization: it
> > can not guarantee that there are no callbacks.
> > 
> > I didn't yet figure out whether a callback
> > in freeze will trigger a bug, but disable_cb
> > won't address it in any case. So let's remove
> > the useless calls as a first step.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> Looks like this isn't in the 3.5 pull request -
> just lost in the shuffle?
> disable_cb is advisory so can't be relied upon.

I always (try to?) reply as I accept patches.

This one did slip by, but it's harmless so no need to push AFAICT.

Applied.

Thanks!
Rusty.

^ permalink raw reply

* Re: [PATCH RFC V8 0/17] Paravirtualized ticket spinlocks
From: Raghavendra K T @ 2012-05-30 11:26 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Jeremy Fitzhardinge, Greg Kroah-Hartman, KVM, linux-doc,
	Srivatsa Vaddagiri, Andi Kleen, H. Peter Anvin, Ingo Molnar,
	Stefano Stabellini, Xen Devel, X86, Ingo Molnar, Peter Zijlstra,
	Nikunj A. Dadhania, Konrad Rzeszutek Wilk, Thomas Gleixner,
	Virtualization, LKML, Attilio Rao, Andrew Morton, Linus Torvalds,
	Stephan Diestelhorst
In-Reply-To: <4FB31CA4.5070908@linux.vnet.ibm.com>

On 05/16/2012 08:49 AM, Raghavendra K T wrote:
> On 05/14/2012 12:15 AM, Raghavendra K T wrote:
>> On 05/07/2012 08:22 PM, Avi Kivity wrote:
>>
>> I could not come up with pv-flush results (also, Nikunj had clarified that
>> the result was on non-PLE
>>
>>> I'd like to see those numbers, then.
>>>
>>> Ingo, please hold on the kvm-specific patches, meanwhile.
[...]
> To summarise,
> with 32 vcpu guest with nr thread=32 we get around 27% improvement. In
> very low/undercommitted systems we may see very small improvement or
> small acceptable degradation ( which it deserves).
>

For large guests, the current value of SPIN_THRESHOLD, along with
ple_window, needed some research and experimentation.

[Thanks to Jeremy/Nikunj for inputs and help in result analysis ]

I started with the debugfs spinlock/histograms and ran experiments with
32- and 64-vcpu guests for spin thresholds of 2k, 4k, 8k, 16k, and 32k with
1vm/2vm/4vm for kernbench, sysbench, ebizzy, and hackbench.
[ spinlock/histogram gives a logarithmic view of lock wait times ]

machine: PLE machine  with 32 cores.

Here is the result summary.
The summary includes two parts:
(1) % improvement w.r.t. the 2k spin threshold,
(2) improvement w.r.t. the sum of histogram numbers in debugfs (which gives a
rough indication of contention/cpu time wasted).

For example, 98% for the 4k threshold in kbench 1vm implies a 98%
reduction in sigma(histogram values) compared to the 2k case.
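
The reduction figure in part (2) is a plain relative difference against the 2k baseline. A minimal sketch, assuming the histogram buckets have already been summed; the function name is hypothetical, and any inputs passed to it here are made up rather than taken from the measurements:

```c
/* Percentage reduction of a histogram sum relative to the 2k baseline.
 * Returns 0 when the baseline is 0 to avoid dividing by zero. */
static double histo_reduction_pct(double base_2k_sum, double new_sum)
{
	if (base_2k_sum == 0.0)
		return 0.0;
	return (base_2k_sum - new_sum) * 100.0 / base_2k_sum;
}
```

For instance, a baseline sum of 1000 falling to 20 would be reported as 98 in the SPINHisto rows.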

Result for 32 vcpu guest
========================
+----------------+-----------+-----------+-----------+-----------+
|    Base-2k     |     4k    |    8k     |   16k     |    32k    |
+----------------+-----------+-----------+-----------+-----------+
|     kbench-1vm |       44  |       50  |       46  |       41  |
|  SPINHisto-1vm |       98  |       99  |       99  |       99  |
|     kbench-2vm |       25  |       45  |       49  |       45  |
|  SPINHisto-2vm |       31  |       91  |       99  |       99  |
|     kbench-4vm |      -13  |      -27  |       -2  |       -4  |
|  SPINHisto-4vm |       29  |       66  |       95  |       99  |
+----------------+-----------+-----------+-----------+-----------+
|     ebizzy-1vm |      954  |      942  |      913  |      915  |
|  SPINHisto-1vm |       96  |       99  |       99  |       99  |
|     ebizzy-2vm |      158  |      135  |      123  |      106  |
|  SPINHisto-2vm |       90  |       98  |       99  |       99  |
|     ebizzy-4vm |      -13  |      -28  |      -33  |      -37  |
|  SPINHisto-4vm |       83  |       98  |       99  |       99  |
+----------------+-----------+-----------+-----------+-----------+
|     hbench-1vm |       48  |       56  |       52  |       64  |
|  SPINHisto-1vm |       92  |       95  |       99  |       99  |
|     hbench-2vm |       32  |       40  |       39  |       21  |
|  SPINHisto-2vm |       74  |       96  |       99  |       99  |
|     hbench-4vm |       27  |       15  |        3  |      -57  |
|  SPINHisto-4vm |       68  |       88  |       94  |       97  |
+----------------+-----------+-----------+-----------+-----------+
|    sysbnch-1vm |        0  |        0  |        1  |        0  |
|  SPINHisto-1vm |       76  |       98  |       99  |       99  |
|    sysbnch-2vm |       -1  |        3  |       -1  |       -4  |
|  SPINHisto-2vm |       82  |       94  |       96  |       99  |
|    sysbnch-4vm |        0  |       -2  |       -8  |      -14  |
|  SPINHisto-4vm |       57  |       79  |       88  |       95  |
+----------------+-----------+-----------+-----------+-----------+

Result for 64 vcpu guest
========================
+----------------+-----------+-----------+-----------+-----------+
|    Base-2k     |     4k    |    8k     |   16k     |    32k    |
+----------------+-----------+-----------+-----------+-----------+
|     kbench-1vm |        1  |      -11  |      -25  |       31  |
|  SPINHisto-1vm |        3  |       10  |       47  |       99  |
|     kbench-2vm |       15  |       -9  |      -66  |      -15  |
|  SPINHisto-2vm |        2  |       11  |       19  |       90  |
+----------------+-----------+-----------+-----------+-----------+
|     ebizzy-1vm |      784  |     1097  |      978  |      930  |
|  SPINHisto-1vm |       74  |       97  |       98  |       99  |
|     ebizzy-2vm |       43  |       48  |       56  |       32  |
|  SPINHisto-2vm |       58  |       93  |       97  |       98  |
+----------------+-----------+-----------+-----------+-----------+
|     hbench-1vm |        8  |       55  |       56  |       62  |
|  SPINHisto-1vm |       18  |       69  |       96  |       99  |
|     hbench-2vm |       13  |      -14  |      -75  |      -29  |
|  SPINHisto-2vm |       57  |       74  |       80  |       97  |
+----------------+-----------+-----------+-----------+-----------+
|    sysbnch-1vm |        9  |       11  |       15  |       10  |
|  SPINHisto-1vm |       80  |       93  |       98  |       99  |
|    sysbnch-2vm |        3  |        3  |        4  |        2  |
|  SPINHisto-2vm |       72  |       89  |       94  |       97  |
+----------------+-----------+-----------+-----------+-----------+

From this, a threshold of around 4k-8k seems to be optimal. [ This
is almost in line with the ple_window default. ]
(The lower the spin threshold, the smaller the % of spinlock waits we cover,
which results in more halt_exit/wakeups.)

[ www.xen.org/files/xensummitboston08/LHP.pdf also has good graphical 
detail on covering spinlock waits ]

Beyond the 8k threshold we see no more contention, but that would mean we
have wasted a lot of cpu time in busy waits.

I will get a PLE machine again and continue experimenting with
further tuning of SPIN_THRESHOLD.

^ permalink raw reply
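
The trade-off discussed above (a low threshold halts too often, a high one burns cpu time spinning) comes from a spin-then-halt wait loop. The following is a userspace model of that policy only, not the PV ticketlock code: `spin_then_halt()` and `struct wait_result` are hypothetical, and `free_after` stands in for the point at which the lock holder releases the lock.

```c
#include <stdbool.h>

/* Outcome of one lock wait in the model. */
struct wait_result {
	unsigned spins;   /* iterations spent busy-waiting */
	bool     halted;  /* true if the waiter gave up spinning and halted */
};

/* free_after: iteration at which the lock becomes free (hypothetical input).
 * threshold:  spin budget, playing the role of SPIN_THRESHOLD. */
static struct wait_result spin_then_halt(unsigned free_after, unsigned threshold)
{
	struct wait_result r = { 0, false };

	while (r.spins < threshold) {
		if (r.spins >= free_after)
			return r;       /* lock acquired while still spinning */
		r.spins++;
	}
	r.halted = true;                /* budget exhausted: halt until kicked */
	return r;
}
```

With a made-up wait of 3000 iterations, a 2k budget (2048) ends in a halt after 2048 spins, whereas a 4k budget (4096) acquires the lock after 3000 spins without halting. Raising the threshold covers more waits but also raises the worst-case time spent spinning.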

* Re: [PATCH RFC V8 17/17] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
From: Jan Kiszka @ 2012-05-30 11:54 UTC (permalink / raw)
  To: Raghavendra K T
  Cc: Jeremy Fitzhardinge, Greg Kroah-Hartman, Andi Kleen, KVM,
	Konrad Rzeszutek Wilk, Stefano Stabellini, X86, linux-doc, LKML,
	Ingo Molnar, Srivatsa Vaddagiri, Avi Kivity, H. Peter Anvin,
	Attilio Rao, Virtualization, Linus Torvalds, Xen Devel,
	Stephan Diestelhorst
In-Reply-To: <20120502100947.13206.26518.sendpatchset@codeblue.in.ibm.com>

On 2012-05-02 12:09, Raghavendra K T wrote:
> From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> 
> 
> KVM_HC_KICK_CPU  hypercall added to wakeup halted vcpu in paravirtual spinlock
> enabled guest.
> 
> KVM_FEATURE_PV_UNHALT enables guest to check whether pv spinlock can be enabled
> in guest.
> 
> Thanks Alex for KVM_HC_FEATURES inputs and Vatsa for rewriting KVM_HC_KICK_CPU

This contains valuable documentation for features that are already
supported. Can you break them out and post as separate patch already?
One comment on them below.

> 
> Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> ---
>  Documentation/virtual/kvm/cpuid.txt      |    4 ++
>  Documentation/virtual/kvm/hypercalls.txt |   60 ++++++++++++++++++++++++++++++
>  2 files changed, 64 insertions(+), 0 deletions(-)
> diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
> index 8820685..062dff9 100644
> --- a/Documentation/virtual/kvm/cpuid.txt
> +++ b/Documentation/virtual/kvm/cpuid.txt
> @@ -39,6 +39,10 @@ KVM_FEATURE_CLOCKSOURCE2           ||     3 || kvmclock available at msrs
>  KVM_FEATURE_ASYNC_PF               ||     4 || async pf can be enabled by
>                                     ||       || writing to msr 0x4b564d02
>  ------------------------------------------------------------------------------
> +KVM_FEATURE_PV_UNHALT              ||     6 || guest checks this feature bit
> +                                   ||       || before enabling paravirtualized
> +                                   ||       || spinlock support.
> +------------------------------------------------------------------------------
>  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
>                                     ||       || per-cpu warps are expected in
>                                     ||       || kvmclock.
> diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
> new file mode 100644
> index 0000000..bc3f14a
> --- /dev/null
> +++ b/Documentation/virtual/kvm/hypercalls.txt
> @@ -0,0 +1,60 @@
> +KVM Hypercalls Documentation
> +===========================
> +The template for each hypercall is:
> +1. Hypercall name, value.
> +2. Architecture(s)
> +3. Status (deprecated, obsolete, active)
> +4. Purpose
> +
> +1. KVM_HC_VAPIC_POLL_IRQ
> +------------------------
> +Value: 1
> +Architecture: x86
> +Purpose: None

Purpose: Trigger guest exit so that the host can check for pending
interrupts on reentry.

> +
> +2. KVM_HC_MMU_OP
> +------------------------
> +Value: 2
> +Architecture: x86
> +Status: deprecated.
> +Purpose: Support MMU operations such as writing to PTE,
> +flushing TLB, release PT.
> +
> +3. KVM_HC_FEATURES
> +------------------------
> +Value: 3
> +Architecture: PPC
> +Status: active
> +Purpose: Expose hypercall availability to the guest. On x86 platforms, cpuid
> +used to enumerate which hypercalls are available. On PPC, either device tree
> +based lookup ( which is also what EPAPR dictates) OR KVM specific enumeration
> +mechanism (which is this hypercall) can be used.
> +
> +4. KVM_HC_PPC_MAP_MAGIC_PAGE
> +------------------------
> +Value: 4
> +Architecture: PPC
> +Status: active
> +Purpose: To enable communication between the hypervisor and guest there is a
> +shared page that contains parts of supervisor visible register state.
> +The guest can map this shared page to access its supervisor register through
> +memory using this hypercall.
> +
> +5. KVM_HC_KICK_CPU
> +------------------------
> +Value: 5
> +Architecture: x86
> +Status: active
> +Purpose: Hypercall used to wakeup a vcpu from HLT state
> +
> +Usage example: A vcpu of a paravirtualized guest that is busy-waiting in guest
> +kernel mode for an event to occur (e.g. a spinlock to become available) can
> +execute the HLT instruction once it has busy-waited for more than a threshold
> +time interval. Execution of the HLT instruction causes the hypervisor to put
> +the vcpu to sleep until the occurrence of an appropriate event. Another vcpu of the
> +same guest can wake up the sleeping vcpu by issuing the KVM_HC_KICK_CPU hypercall,
> +specifying the APIC ID of the vcpu to be woken up.
> +
> +TODO:
> +1. more information on input and output needed?
> +2. Add more detail to purpose of hypercalls.

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply
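
For reference, the numbers in the quoted documentation translate into C roughly as follows. This is a sketch based only on the values listed in this RFC's hypercalls.txt and cpuid.txt hunks; `pv_unhalt_available()` is a hypothetical helper, and how the guest obtains the feature bitmap from cpuid is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypercall numbers as listed in the RFC's hypercalls.txt. */
enum {
	KVM_HC_VAPIC_POLL_IRQ     = 1,
	KVM_HC_MMU_OP             = 2,  /* deprecated */
	KVM_HC_FEATURES           = 3,  /* PPC */
	KVM_HC_PPC_MAP_MAGIC_PAGE = 4,  /* PPC */
	KVM_HC_KICK_CPU           = 5,  /* x86, wakes a vcpu from HLT */
};

/* Feature bit number as listed in the RFC's cpuid.txt hunk. */
#define KVM_FEATURE_PV_UNHALT 6

/* features: the KVM feature bitmap the guest has already read via cpuid. */
static bool pv_unhalt_available(uint32_t features)
{
	return (features >> KVM_FEATURE_PV_UNHALT) & 1u;
}
```

A guest would perform this check before enabling paravirtualized spinlocks, since KVM_HC_KICK_CPU is only meaningful when the host advertises the feature.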

* [PATCH v2] virtio_blk: unlock vblk->lock during kick
From: Stefan Hajnoczi @ 2012-05-30 13:19 UTC (permalink / raw)
  To: virtualization; +Cc: Stefan Hajnoczi, kvm, Michael S. Tsirkin, khoa

Holding the vblk->lock across kick causes poor scalability in SMP
guests.  If one CPU is doing virtqueue kick and another CPU touches the
vblk->lock it will have to spin until virtqueue kick completes.

This patch reduces system% CPU utilization in SMP guests that are
running multithreaded I/O-bound workloads.  The improvements are small
but show as iops and SMP are increased.

Khoa Huynh <khoa@us.ibm.com> provided initial performance data that
indicates this optimization is worthwhile at high iops.

Asias He <asias@redhat.com> reports the following fio results:

Host: Linux 3.4.0+ #302 SMP x86_64 GNU/Linux
Guest: same as host kernel

Average 3 runs:
with locked kick
-------------------------
read    iops=119907.50 bw=59954.00 runt=35018.50 io=2048.00
write   iops=217187.00 bw=108594.00 runt=19312.00 io=2048.00
read    iops=33948.00 bw=16974.50 runt=186820.50 io=3095.70
write   iops=35014.00 bw=17507.50 runt=181151.00 io=3095.70
clat (usec)     max=3484.10 avg=121085.38 stdev=174416.11 min=0.00
clat (usec)     max=3438.30 avg=59863.35 stdev=116607.69 min=0.00
clat (usec)     max=3745.65 avg=454501.30 stdev=332699.00 min=0.00
clat (usec)     max=4089.75 avg=442374.99 stdev=304874.62 min=0.00
cpu     sys=615.12 majf=24080.50 ctx=64253616.50 usr=68.08 minf=17907363.00
cpu     sys=1235.95 majf=23389.00 ctx=59788148.00 usr=98.34 minf=20020008.50
cpu     sys=764.96 majf=28414.00 ctx=848279274.00 usr=36.39 minf=19737254.00
cpu     sys=714.13 majf=21853.50 ctx=854608972.00 usr=33.56 minf=18256760.50

with unlocked kick
-------------------------
read    iops=118559.00 bw=59279.66 runt=35400.66 io=2048.00
write   iops=227560.00 bw=113780.33 runt=18440.00 io=2048.00
read    iops=34567.66 bw=17284.00 runt=183497.33 io=3095.70
write   iops=34589.33 bw=17295.00 runt=183355.00 io=3095.70
clat (usec)     max=3485.56 avg=121989.58 stdev=197355.15 min=0.00
clat (usec)     max=3222.33 avg=57784.11 stdev=141002.89 min=0.00
clat (usec)     max=4060.93 avg=447098.65 stdev=315734.33 min=0.00
clat (usec)     max=3656.30 avg=447281.70 stdev=314051.33 min=0.00
cpu     sys=683.78 majf=24501.33 ctx=64435364.66 usr=68.91 minf=17907893.33
cpu     sys=1218.24 majf=25000.33 ctx=60451475.00 usr=101.04 minf=19757720.00
cpu     sys=740.39 majf=24809.00 ctx=845290443.66 usr=37.25 minf=19349958.33
cpu     sys=723.63 majf=27597.33 ctx=850199927.33 usr=35.35 minf=19092343.00

FIO config file
-------------------------------------

[global]
exec_prerun="echo 3 > /proc/sys/vm/drop_caches"
group_reporting
norandommap
ioscheduler=noop
thread
bs=512
size=4MB
direct=1
filename=/dev/vdb
numjobs=256
ioengine=aio
iodepth=64
loops=3

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 drivers/block/virtio_blk.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 693187d..1a50f41 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -201,8 +201,14 @@ static void do_virtblk_request(struct request_queue *q)
 		issued++;
 	}
 
-	if (issued)
-		virtqueue_kick(vblk->vq);
+	if (!issued)
+		return;
+
+	if (virtqueue_kick_prepare(vblk->vq)) {
+		spin_unlock(&vblk->lock);
+		virtqueue_notify(vblk->vq);
+		spin_lock(&vblk->lock);
+	}
 }
 
 /* return id (s/n) string for *disk to *id_str
-- 
1.7.10

^ permalink raw reply related
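
The locking pattern in the hunk above (decide under the lock whether a notification is needed, then notify with the lock dropped) can be modelled in userspace. Everything below is a hypothetical sketch: `struct blk_model` replaces the real `vblk` and virtqueue, and a boolean stands in for the spinlock so that the model can record whether the notification happened while the lock was held.

```c
#include <stdbool.h>

/* Hypothetical model of the lock and the queue; not the virtio_blk code. */
struct blk_model {
	bool lock_held;            /* stands in for vblk->lock */
	bool need_notify;          /* what kick_prepare would report */
	bool notified;             /* the host was notified */
	bool notified_under_lock;  /* the notification happened with the lock held */
};

/* Cheap check, done with the lock held. */
static bool model_kick_prepare(struct blk_model *b)
{
	return b->need_notify;
}

/* Expensive part: in a guest this is the exit to the host. */
static void model_notify(struct blk_model *b)
{
	b->notified = true;
	b->notified_under_lock = b->lock_held;
}

/* Called with the lock held and returns with it held, as in the patch. */
static void model_request_fn_tail(struct blk_model *b, unsigned issued)
{
	if (!issued)
		return;

	if (model_kick_prepare(b)) {
		b->lock_held = false;   /* spin_unlock(&vblk->lock) */
		model_notify(b);
		b->lock_held = true;    /* spin_lock(&vblk->lock) */
	}
}
```

While the notification is in flight, other CPUs in this model would be free to take the lock, which is where the reduction in spinning comes from.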

* Re: [PATCH v2] virtio_blk: unlock vblk->lock during kick
From: Christian Borntraeger @ 2012-05-30 13:39 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kvm, Michael S. Tsirkin, virtualization, khoa
In-Reply-To: <1338383963-17910-1-git-send-email-stefanha@linux.vnet.ibm.com>

On 30/05/12 15:19, Stefan Hajnoczi wrote:
> Holding the vblk->lock across kick causes poor scalability in SMP
> guests.  If one CPU is doing virtqueue kick and another CPU touches the
> vblk->lock it will have to spin until virtqueue kick completes.
> 
> This patch reduces system% CPU utilization in SMP guests that are
> running multithreaded I/O-bound workloads.  The improvements are small
> but show as iops and SMP are increased.

Funny, recently I got a bug report regarding spinlock lockup
(see http://lkml.indiana.edu/hypermail/linux/kernel/1205.3/02201.html)
Turned out that blk_done was called on many guest cpus while the guest
was heavily paging on one virtio block device. (and the guest had much
more cpus than the host)
This patch will probably reduce the pressure in those cases as well:
we can then finish requests while somebody else is doing the kick.

IIRC there were some other approaches to address this lock holding during
kick, but this looks like the least intrusive one.

> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>

^ permalink raw reply

* Re: [PATCH RFC V8 17/17] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
From: Raghavendra K T @ 2012-05-30 13:44 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Jeremy Fitzhardinge, Greg Kroah-Hartman, Andi Kleen, KVM,
	Konrad Rzeszutek Wilk, Stefano Stabellini, X86, linux-doc, LKML,
	Ingo Molnar, Srivatsa Vaddagiri, Avi Kivity, H. Peter Anvin,
	Attilio Rao, Virtualization, Linus Torvalds, Xen Devel,
	Stephan Diestelhorst
In-Reply-To: <4FC60A71.5050804@siemens.com>

On 05/30/2012 05:24 PM, Jan Kiszka wrote:
> On 2012-05-02 12:09, Raghavendra K T wrote:
>> From: Raghavendra K T<raghavendra.kt@linux.vnet.ibm.com>
>>
>> KVM_HC_KICK_CPU  hypercall added to wakeup halted vcpu in paravirtual spinlock
>> enabled guest.
>>
>> KVM_FEATURE_PV_UNHALT enables guest to check whether pv spinlock can be enabled
>> in guest.
>>
>> Thanks Alex for KVM_HC_FEATURES inputs and Vatsa for rewriting KVM_HC_KICK_CPU
>
> This contains valuable documentation for features that are already
> supported. Can you break them out and post as separate patch already?
> One comment on them below.
>

That sounds like a good idea. Sure, will do that.

>>
>> Signed-off-by: Srivatsa Vaddagiri<vatsa@linux.vnet.ibm.com>
>> Signed-off-by: Raghavendra K T<raghavendra.kt@linux.vnet.ibm.com>
>> ---
>>   Documentation/virtual/kvm/cpuid.txt      |    4 ++
>>   Documentation/virtual/kvm/hypercalls.txt |   60 ++++++++++++++++++++++++++++++
>>   2 files changed, 64 insertions(+), 0 deletions(-)
>> diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
>> index 8820685..062dff9 100644
>> --- a/Documentation/virtual/kvm/cpuid.txt
>> +++ b/Documentation/virtual/kvm/cpuid.txt
>> @@ -39,6 +39,10 @@ KVM_FEATURE_CLOCKSOURCE2           ||     3 || kvmclock available at msrs
>>   KVM_FEATURE_ASYNC_PF               ||     4 || async pf can be enabled by
>>                                      ||       || writing to msr 0x4b564d02
>>   ------------------------------------------------------------------------------
>> +KVM_FEATURE_PV_UNHALT              ||     6 || guest checks this feature bit
>> +                                   ||       || before enabling paravirtualized
>> +                                   ||       || spinlock support.
>> +------------------------------------------------------------------------------
>>   KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
>>                                      ||       || per-cpu warps are expected in
>>                                      ||       || kvmclock.
>> diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
>> new file mode 100644
>> index 0000000..bc3f14a
>> --- /dev/null
>> +++ b/Documentation/virtual/kvm/hypercalls.txt
>> @@ -0,0 +1,60 @@
>> +KVM Hypercalls Documentation
>> +===========================
>> +The template for each hypercall is:
>> +1. Hypercall name, value.
>> +2. Architecture(s)
>> +3. Status (deprecated, obsolete, active)
>> +4. Purpose
>> +
>> +1. KVM_HC_VAPIC_POLL_IRQ
>> +------------------------
>> +Value: 1
>> +Architecture: x86
>> +Purpose: None
>
> Purpose: Trigger guest exit so that the host can check for pending
> interrupts on reentry.

Will fold this in and resend.

[...]

^ permalink raw reply

* [PATCH repost] virtio-net: remove useless disable on freeze
From: Michael S. Tsirkin @ 2012-05-30 14:21 UTC (permalink / raw)
  To: Rusty Russell, Michael S. Tsirkin, virtualization, netdev,
	linux-kernel

disable_cb is just an optimization: it
can not guarantee that there are no callbacks.
In particular it doesn't have any effect when
event index is on.

Instead, detach, napi disable and reset on freeze ensure we don't run
concurrently with a callback.

Remove the useless calls so we get the same behaviour
with and without event index.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

Reposting a patch that seems to have fallen through the cracks.

 drivers/net/virtio_net.c |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 9ce6995..5214b1e 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1231,11 +1231,6 @@ static int virtnet_freeze(struct virtio_device *vdev)
 	vi->config_enable = false;
 	mutex_unlock(&vi->config_lock);
 
-	virtqueue_disable_cb(vi->rvq);
-	virtqueue_disable_cb(vi->svq);
-	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ))
-		virtqueue_disable_cb(vi->cvq);
-
 	netif_device_detach(vi->dev);
 	cancel_delayed_work_sync(&vi->refill);
 
-- 
MST

^ permalink raw reply related
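
Why disable_cb cannot be relied upon can be illustrated with a small model of the suppression handshake. This is a deliberately simplified sketch with hypothetical names: it ignores the race in which the host has already decided to interrupt before the flag is set, and it reduces event-index handling to "the flag is not consulted".

```c
#include <stdbool.h>

/* Hypothetical model of guest/host interrupt suppression. */
struct vq_model {
	bool event_idx;     /* event index feature negotiated */
	bool no_interrupt;  /* hint the guest sets in disable_cb */
	bool device_reset;  /* the device has been reset */
};

/* Guest side: only sets a hint. */
static void model_disable_cb(struct vq_model *vq)
{
	vq->no_interrupt = true;
}

/* Host side: may a callback still be delivered for a newly used buffer? */
static bool model_host_may_interrupt(const struct vq_model *vq)
{
	if (vq->device_reset)
		return false;          /* a reset device raises nothing */
	if (vq->event_idx)
		return true;           /* the hint is not consulted with event index on */
	return !vq->no_interrupt;      /* otherwise the hint is honoured (races aside) */
}
```

In the model, only the reset gives a guarantee in both configurations, which matches the commit message: detach, napi disable and reset are what keep freeze from running concurrently with a callback.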

* Re: [PATCH repost] virtio-net: remove useless disable on freeze
From: David Miller @ 2012-05-30 20:36 UTC (permalink / raw)
  To: mst; +Cc: netdev, linux-kernel, virtualization
In-Reply-To: <20120530142128.GA31299@redhat.com>

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Wed, 30 May 2012 17:21:29 +0300

> disable_cb is just an optimization: it
> can not guarantee that there are no callbacks.
> In particular it doesn't have any effect when
> event index is on.
> 
> Instead, detach, napi disable and reset on freeze ensure we don't run
> concurrently with a callback.
> 
> Remove the useless calls so we get the same behaviour
> with and without event index.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> 
> Reposting a patch that seems to have fallen through the cracks.

It seems not to have made it to the lists or been properly picked up
by patchwork.

In any event, I've applied this, thanks.

^ permalink raw reply

* Re: [PATCH RFC] virtio-net: remove useless disable on freeze
From: Stephen Rothwell @ 2012-05-31  8:35 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kvm, Michael S. Tsirkin, netdev, linux-kernel, virtualization,
	Amit Shah
In-Reply-To: <87r4u2dllo.fsf@rustcorp.com.au>


Hi all,

On Wed, 30 May 2012 19:41:47 +0930 Rusty Russell <rusty@rustcorp.com.au> wrote:
>
> On Mon, 28 May 2012 15:53:25 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Wed, Apr 04, 2012 at 12:19:54PM +0300, Michael S. Tsirkin wrote:
> > > disable_cb is just an optimization: it
> > > can not guarantee that there are no callbacks.
> > > 
> > > I didn't yet figure out whether a callback
> > > in freeze will trigger a bug, but disable_cb
> > > won't address it in any case. So let's remove
> > > the useless calls as a first step.
> > > 
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > 
> > Looks like this isn't in the 3.5 pull request -
> > just lost in the shuffle?
> > disable_cb is advisory so can't be relied upon.
> 
> I always (try to?) reply as I accept patches.
> 
> This one did slip by, but it's harmless so no need to push AFAICT.
> 
> Applied.

This patch exists in two trees in linux-next already ... Davem's net tree
(so presumably he will send it to Linus shortly) and Michael's vhost tree
(is that tree needed any more?).  Presumably it is now also in the rr
tree?

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au


^ permalink raw reply

* Re: [PATCH RFC] virtio-net: remove useless disable on freeze
From: Michael S. Tsirkin @ 2012-05-31  8:47 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: kvm, netdev, linux-kernel, virtualization, Amit Shah
In-Reply-To: <20120531183508.f426eb25fe1b94139c637348@canb.auug.org.au>

On Thu, May 31, 2012 at 06:35:08PM +1000, Stephen Rothwell wrote:
> Hi all,
> 
> On Wed, 30 May 2012 19:41:47 +0930 Rusty Russell <rusty@rustcorp.com.au> wrote:
> >
> > On Mon, 28 May 2012 15:53:25 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Wed, Apr 04, 2012 at 12:19:54PM +0300, Michael S. Tsirkin wrote:
> > > > disable_cb is just an optimization: it
> > > > can not guarantee that there are no callbacks.
> > > > 
> > > > I didn't yet figure out whether a callback
> > > > in freeze will trigger a bug, but disable_cb
> > > > won't address it in any case. So let's remove
> > > > the useless calls as a first step.
> > > > 
> > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > 
> > > Looks like this isn't in the 3.5 pull request -
> > > just lost in the shuffle?
> > > disable_cb is advisory so can't be relied upon.
> > 
> > I always (try to?) reply as I accept patches.
> > 
> > This one did slip by, but it's harmless so no need to push AFAICT.
> > 
> > Applied.
> 
> This patch exists in two trees in linux-next already ... Davem's net tree
> (so presumably he will send it to Linus shortly) and Michael's vhost tree
> (is that tree needed any more?).

Yes and I dropped the patch from there, just did not push yet.

>  Presumably it is now also in the rr
> tree?
> 
> -- 
> Cheers,
> Stephen Rothwell                    sfr@canb.auug.org.au

^ permalink raw reply

* Re: [PATCH RFC] virtio-net: remove useless disable on freeze
From: Michael S. Tsirkin @ 2012-05-31  9:04 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: kvm, netdev, linux-kernel, virtualization, Amit Shah
In-Reply-To: <20120531084717.GB32310@redhat.com>

On Thu, May 31, 2012 at 11:47:17AM +0300, Michael S. Tsirkin wrote:
> On Thu, May 31, 2012 at 06:35:08PM +1000, Stephen Rothwell wrote:
> > Hi all,
> > 
> > On Wed, 30 May 2012 19:41:47 +0930 Rusty Russell <rusty@rustcorp.com.au> wrote:
> > >
> > > On Mon, 28 May 2012 15:53:25 +0300, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > On Wed, Apr 04, 2012 at 12:19:54PM +0300, Michael S. Tsirkin wrote:
> > > > > disable_cb is just an optimization: it
> > > > > can not guarantee that there are no callbacks.
> > > > > 
> > > > > I didn't yet figure out whether a callback
> > > > > in freeze will trigger a bug, but disable_cb
> > > > > won't address it in any case. So let's remove
> > > > > the useless calls as a first step.
> > > > > 
> > > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > 
> > > > Looks like this isn't in the 3.5 pull request -
> > > > just lost in the shuffle?
> > > > disable_cb is advisory so can't be relied upon.
> > > 
> > > I always (try to?) reply as I accept patches.
> > > 
> > > This one did slip by, but it's harmless so no need to push AFAICT.
> > > 
> > > Applied.
> > 
> > This patch exists in two trees in linux-next already ... Davem's net tree
> > (so presumably he will send it to Linus shortly) and Michael's vhost tree
> > (is that tree needed any more?).
> 
> Yes and I dropped the patch from there, just did not push yet.

pushed.
There's usually not too much in my tree but it helps me a lot
that it's in linux-next.

> >  Presumably it is now also in the rr
> > tree?
> > 
> > -- 
> > Cheers,
> > Stephen Rothwell                    sfr@canb.auug.org.au
> 
> 

^ permalink raw reply

* CFP: Workshop Web Site: https://sites.google.com/site/federatedclouds/
From: Gregor von Laszewski @ 2012-05-31 14:43 UTC (permalink / raw)
  To: virtualization; +Cc: Gregor von Laszewski


[-- Attachment #1.1: Type: text/plain, Size: 3354 bytes --]

We apologize if you received multiple copies

Call For Papers

Workshop on Cloud Services, Federation, and the 8th Open Cirrus Summit

    Workshop Web Site: https://sites.google.com/site/federatedclouds/

    In conjunction with the "International Conference on Autonomic
                             Computing, San Jose, 18-20 Sep. 2012"

    Location: San Jose, CA (USA)
    Date: 21 September 2012
    Contact: ocsfcw2012@googlegroups.com

Deadlines:

    Papers:                July 14, 2012
    Author Notification:   August 7, 2012
    Final Papers due:      September 7, 2012
    Workshop:              September 21, 2012


Summary:

This event will be run as a workshop in conjunction with ICAC 2012 and
will bring together researchers and practitioners to discuss the
newest ideas and challenges in cloud services and federated cloud
computing. With your help, we hope to accelerate the discussion in
both commercial and academic contexts.

Focus of the Workshop:

The services offered by clouds are becoming critical for a wide
variety of applications used by industry, education and
government. There are now many examples of successful cloud services
offered by public, private and community clouds. Many efforts exist
that are creating cloud toolkits and frameworks to simplify the
development and delivery of cloud services.

The main purpose of this workshop is to bring together those
responsible for designing, managing, and operating cloud services so
that they can share experiences with each other.  The workshop also
welcomes users with requirements for new cloud services.

We are particularly interested in cloud services that can be used for
federating clouds. Topics of interest include:

        Experiences, best practices, and lessons learned from
            operating cloud services
        Testbeds for designing new cloud services
        Cloud services for federating clouds
        Management and provisioning of cloud services
        Health and status monitoring of cloud services
        Security of cloud services
        Requirements for new cloud services
        Reliability and fault tolerance of cloud services
        Cloud services that span public and private clouds
        Design of cloud services
            Intercloud services
            Federation service
            Identity services
            Cloud bursting services
            Cloud services for emerging applications
        Applications utilizing such services

This workshop will build upon the success of the prior Open Cirrus
events and the prior Open Cloud Consortium events.  The goal is to
help build a community for those responsible for operating clouds
and cloud testbeds, as well as those interested in designing new cloud
services.

The Workshop is co-sponsored by the Open Cirrus Consortium, the
Open Cloud Consortium, and FutureGrid.

Paper Format

Full papers (a maximum of 6 pages in the two-column ACM proceedings
format) are invited on a wide variety of topics relating to federated
clouds and their application. Submitted papers must be original work,
and may not be under consideration for another conference or
journal. Complete formatting and submission instructions can be found
on the workshop web site. Accepted papers will appear in proceedings
distributed at the conference and available electronically.

[-- Attachment #1.2: Type: text/html, Size: 3754 bytes --]


^ permalink raw reply

* Re: [PATCH v2] virtio_blk: unlock vblk->lock during kick
From: Asias He @ 2012-06-01  4:38 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: khoa, Michael S. Tsirkin, kvm, virtualization
In-Reply-To: <1338383963-17910-1-git-send-email-stefanha@linux.vnet.ibm.com>

Hello Stefan,

On 05/30/2012 09:19 PM, Stefan Hajnoczi wrote:
> Holding the vblk->lock across kick causes poor scalability in SMP
> guests.  If one CPU is doing virtqueue kick and another CPU touches the
> vblk->lock it will have to spin until virtqueue kick completes.
>
> This patch reduces system% CPU utilization in SMP guests that are
> running multithreaded I/O-bound workloads.  The improvements are small
> but show as iops and SMP are increased.
>
> Khoa Huynh <khoa@us.ibm.com> provided initial performance data that
> indicates this optimization is worthwhile at high iops.
>
> Asias He <asias@redhat.com> reports the following fio results:
>
> Host: Linux 3.4.0+ #302 SMP x86_64 GNU/Linux
> Guest: same as host kernel
>
> Average 3 runs:
> with locked kick
> -------------------------
> read    iops=119907.50 bw=59954.00 runt=35018.50 io=2048.00
> write   iops=217187.00 bw=108594.00 runt=19312.00 io=2048.00
> read    iops=33948.00 bw=16974.50 runt=186820.50 io=3095.70
> write   iops=35014.00 bw=17507.50 runt=181151.00 io=3095.70
> clat (usec)     max=3484.10 avg=121085.38 stdev=174416.11 min=0.00
> clat (usec)     max=3438.30 avg=59863.35 stdev=116607.69 min=0.00
> clat (usec)     max=3745.65 avg=454501.30 stdev=332699.00 min=0.00
> clat (usec)     max=4089.75 avg=442374.99 stdev=304874.62 min=0.00
> cpu     sys=615.12 majf=24080.50 ctx=64253616.50 usr=68.08 minf=17907363.00
> cpu     sys=1235.95 majf=23389.00 ctx=59788148.00 usr=98.34 minf=20020008.50
> cpu     sys=764.96 majf=28414.00 ctx=848279274.00 usr=36.39 minf=19737254.00
> cpu     sys=714.13 majf=21853.50 ctx=854608972.00 usr=33.56 minf=18256760.50
>
> with unlocked kick
> -------------------------
> read    iops=118559.00 bw=59279.66 runt=35400.66 io=2048.00
> write   iops=227560.00 bw=113780.33 runt=18440.00 io=2048.00
> read    iops=34567.66 bw=17284.00 runt=183497.33 io=3095.70
> write   iops=34589.33 bw=17295.00 runt=183355.00 io=3095.70
> clat (usec)     max=3485.56 avg=121989.58 stdev=197355.15 min=0.00
> clat (usec)     max=3222.33 avg=57784.11 stdev=141002.89 min=0.00
> clat (usec)     max=4060.93 avg=447098.65 stdev=315734.33 min=0.00
> clat (usec)     max=3656.30 avg=447281.70 stdev=314051.33 min=0.00
> cpu     sys=683.78 majf=24501.33 ctx=64435364.66 usr=68.91 minf=17907893.33
> cpu     sys=1218.24 majf=25000.33 ctx=60451475.00 usr=101.04 minf=19757720.00
> cpu     sys=740.39 majf=24809.00 ctx=845290443.66 usr=37.25 minf=19349958.33
> cpu     sys=723.63 majf=27597.33 ctx=850199927.33 usr=35.35 minf=19092343.00
>
> FIO config file
> -------------------------------------
>
> [global]
> exec_prerun="echo 3 > /proc/sys/vm/drop_caches"
> group_reporting
> norandommap
> ioscheduler=noop
> thread
> bs=512
> size=4MB
> direct=1
> filename=/dev/vdb
> numjobs=256
> ioengine=aio
> iodepth=64
> loops=3
>
> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> ---
>   drivers/block/virtio_blk.c |   10 ++++++++--
>   1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 693187d..1a50f41 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -201,8 +201,14 @@ static void do_virtblk_request(struct request_queue *q)
>   		issued++;
>   	}
>
> -	if (issued)
> -		virtqueue_kick(vblk->vq);
> +	if (!issued)
> +		return;
> +
> +	if (virtqueue_kick_prepare(vblk->vq)) {
> +		spin_unlock(&vblk->lock);
> +		virtqueue_notify(vblk->vq);
> +		spin_lock(&vblk->lock);
> +	}
>   }

Could you use vblk->disk->queue->queue_lock to reference the lock so 
that this patch will work on top of this one:

    virtio-blk: Use block layer provided spinlock

BTW, why was the function name changed to virtqueue_notify() from
virtqueue_kick_notify()? It seems Rusty renamed it. I think the latter name
is better because it is more consistent and easier to remember.

    virtqueue_kick()
    virtqueue_kick_prepare()
    virtqueue_kick_notify()

I believe you used virtqueue_kick_notify() in your original patch.
See:
    http://www.spinics.net/lists/linux-virtualization/msg14616.html

-- 
Asias
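
For readers following along, here is a minimal userspace sketch of the
pattern the quoted patch applies: decide under the lock whether a
notification is needed, then drop the lock around the slow notification.
It is an illustration only, not the driver code. struct demo_queue and
all demo_* functions are hypothetical names; a pthread mutex stands in
for vblk->lock and an atomic counter stands in for the host notification.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a virtqueue plus its driver lock. */
struct demo_queue {
	pthread_mutex_t lock;		/* plays the role of vblk->lock */
	unsigned int pending;		/* requests queued since last kick; protected by lock */
	atomic_uint notifications;	/* slow notifications sent; updated without the lock */
};

static void demo_queue_init(struct demo_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->pending = 0;
	atomic_init(&q->notifications, 0);
}

/* Cheap step. Caller holds q->lock. Returns true if a notify is needed. */
static bool demo_kick_prepare(struct demo_queue *q)
{
	bool need_notify = q->pending > 0;

	q->pending = 0;
	return need_notify;
}

/* Expensive step (a VM exit in a real guest). Caller must not hold q->lock. */
static void demo_notify(struct demo_queue *q)
{
	atomic_fetch_add(&q->notifications, 1);
}

/* Queue `count` requests, then kick with the lock dropped around the notify. */
static void demo_submit(struct demo_queue *q, unsigned int count)
{
	pthread_mutex_lock(&q->lock);
	q->pending += count;
	if (demo_kick_prepare(q)) {
		/* Other threads can take q->lock while the notify is in flight. */
		pthread_mutex_unlock(&q->lock);
		demo_notify(q);
		pthread_mutex_lock(&q->lock);
	}
	pthread_mutex_unlock(&q->lock);
}
```

With a freshly initialised queue, demo_submit(&q, 3) sends one
notification, while a later demo_submit(&q, 0) sends none because
demo_kick_prepare() finds nothing pending.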

^ permalink raw reply

* Question regarding Virtio Console and Remoteproc
From: Sjur BRENDELAND @ 2012-06-01  7:31 UTC (permalink / raw)
  To: Amit Shah, Rusty Russell,
	virtualization@lists.linux-foundation.org
  Cc: Linus Walleij, Arnd Bergmann

Hi Amit and Rusty,

I've been looking into the possibility of using the Virtio Console
Driver together with the remoteproc framework to communicate with
ST-Ericsson modem over shared memory.

It seems like Virtio Console would be a good fit, except for an issue
with buffer allocation. Due to HW limitations the STE-Modem cannot
access kernel memory (no IOMMU and limited address range). Instead
we have a designated shared memory region used for IPC. 

Due to this I cannot use kmalloc() for buffer allocation, but I
have to allocate buffers from the memory region shared with the
modem.

In remoteproc this is solved by using dma_alloc_coherent() for all
memory to be shared with the modem. This works fine for me, because
I can pass the IPC memory region to dma_declare_coherent_memory() 
so dma_alloc_coherent() will allocate from this memory region.

I think I can solve this issue in Virtio Console by changing calls
to kmalloc() to something like:

	if (virtio_has_feature(vdev, VIRTIO_CONSOLE_USE_DMA_MEM)) {
		dma_addr_t dma;
		buf = dma_alloc_coherent(dev, size, &dma, GFP_KERNEL);
	} else {
		buf = kmalloc(size, GFP_KERNEL);
	}

I'd like to get the opinion from you virtualization folks on this!
If you think it looks reasonable I might start cooking some patches...

Regards,
Sjur
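
To make the proposal above concrete, here is a small userspace sketch of
the same decision: take buffers from a dedicated shared region when a
feature flag is set, and from the general allocator otherwise. It is
only a model under stated assumptions. The demo_* names, the 4096-byte
region and the 64-byte alignment are hypothetical; a static array plays
the role of the IPC memory region and malloc() plays the role of
kmalloc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_SHARED_SIZE 4096	/* hypothetical size of the shared IPC region */

/* Stand-in for the memory region the modem is able to access. */
static _Alignas(64) unsigned char demo_shared_region[DEMO_SHARED_SIZE];
static size_t demo_shared_used;

/* Simple bump allocator over the shared region; returns NULL when full. */
static void *demo_shared_alloc(size_t size)
{
	size_t aligned = (size + 63) & ~(size_t)63;
	void *p;

	/* Reject overflow of the rounding and requests that do not fit. */
	if (aligned < size || aligned > DEMO_SHARED_SIZE - demo_shared_used)
		return NULL;
	p = demo_shared_region + demo_shared_used;
	demo_shared_used += aligned;
	return p;
}

/* True if p points into the shared region. */
static bool demo_in_shared_region(const void *p)
{
	uintptr_t addr = (uintptr_t)p;
	uintptr_t base = (uintptr_t)demo_shared_region;

	return addr >= base && addr < base + DEMO_SHARED_SIZE;
}

/* Pick the allocator based on a feature flag, as in the proposal. */
static void *demo_alloc_buf(bool use_shared_mem, size_t size)
{
	if (use_shared_mem)
		return demo_shared_alloc(size);
	return malloc(size);
}
```

A buffer obtained with use_shared_mem set lies inside
demo_shared_region, while one obtained without it comes from the heap.
The sketch omits freeing shared buffers, which a real allocator would
need to handle.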

^ permalink raw reply

* Re: Question regarding Virtio Console and Remoteproc
From: Ohad Ben-Cohen @ 2012-06-01  7:51 UTC (permalink / raw)
  To: Sjur BRENDELAND
  Cc: Amit Shah, Linus Walleij, Arnd Bergmann,
	virtualization@lists.linux-foundation.org
In-Reply-To: <81C3A93C17462B4BBD7E272753C10579232F4FB338@EXDCVYMBSTM005.EQ1STM.local>

On Fri, Jun 1, 2012 at 10:31 AM, Sjur BRENDELAND
<sjur.brandeland@stericsson.com> wrote:
>        if (virtio_has_feature(vdev, VIRTIO_CONSOLE_USE_DMA_MEM)) {
>                dma_addr_t dma;
>                buf = dma_alloc_coherent(dev, size, &dma, GFP_KERNEL);
>        } else {
>                buf = kmalloc(size, GFP_KERNEL);
>        }

Something along those lines is also needed for remote processors which
access memory via an IOMMU (e.g. OMAP4's M3 and DSP).

Allocating the memory via the DMA API will seamlessly configure the
relevant IOMMU as needed, and will make the buffers accessible to the
remote processors.

Thanks,
Ohad.

^ permalink raw reply

* Re: [PATCH v2] virtio_blk: unlock vblk->lock during kick
From: Stefan Hajnoczi @ 2012-06-01  7:58 UTC (permalink / raw)
  To: Asias He; +Cc: khoa, virtualization, Stefan Hajnoczi, kvm, Michael S. Tsirkin
In-Reply-To: <4FC84734.3070808@redhat.com>

On Fri, Jun 1, 2012 at 5:38 AM, Asias He <asias@redhat.com> wrote:
> On 05/30/2012 09:19 PM, Stefan Hajnoczi wrote:
> Could you use vblk->disk->queue->queue_lock to reference the lock so that
> this patch will work on top of this one:
>
>   virtio-blk: Use block layer provided spinlock

Absolutely.  I'll rebase on top of your patches and resend.

> BTW, why was the function name changed to virtqueue_notify() from
> virtqueue_kick_notify()? It seems Rusty renamed it. I think the latter name is
> better because it is more consistent and easier to remember.
>
>   virtqueue_kick()
>   virtqueue_kick_prepare()
>   virtqueue_kick_notify()
>
> I believe you used virtqueue_kick_notify() in your original patch.
> See:
>   http://www.spinics.net/lists/linux-virtualization/msg14616.html

You are right, it was renamed upstream.  I liked the old name better
too, but am happy to use what's there now.

Stefan

^ permalink raw reply

* Re: [PATCH v2 3/3] virtio-blk: Use block layer provided spinlock
From: Michael S. Tsirkin @ 2012-06-01  8:49 UTC (permalink / raw)
  To: Asias He; +Cc: kvm, virtualization
In-Reply-To: <1337933007-30634-1-git-send-email-asias@redhat.com>

On Fri, May 25, 2012 at 04:03:27PM +0800, Asias He wrote:
> Block layer will allocate a spinlock for the queue if the driver does
> not provide one in blk_init_queue().
> 
> The reason to use the internal spinlock is that blk_cleanup_queue() will
> switch to use the internal spinlock in the cleanup code path.
> 
>         if (q->queue_lock != &q->__queue_lock)
>                 q->queue_lock = &q->__queue_lock;
> 
> However, processes which are in D state might have taken the driver
> provided spinlock, when the processes wake up, they would release the
> block provided spinlock.
> 
> =====================================
> [ BUG: bad unlock balance detected! ]
> 3.4.0-rc7+ #238 Not tainted
> -------------------------------------
> fio/3587 is trying to release lock (&(&q->__queue_lock)->rlock) at:
> [<ffffffff813274d2>] blk_queue_bio+0x2a2/0x380
> but there are no more locks to release!
> 
> other info that might help us debug this:
> 1 lock held by fio/3587:
>  #0:  (&(&vblk->lock)->rlock){......}, at:
> [<ffffffff8132661a>] get_request_wait+0x19a/0x250
> 
> Other drivers use block layer provided spinlock as well, e.g. SCSI.
> 
> Switching to the block layer provided spinlock saves a bit of memory and
> does not increase lock contention. Performance test shows no real
> difference is observed before and after this patch.
> 
> Changes in v2: Improve commit log as Michael suggested.
> 
> Cc: Rusty Russell <rusty@rustcorp.com.au>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: virtualization@lists.linux-foundation.org
> Cc: kvm@vger.kernel.org
> Signed-off-by: Asias He <asias@redhat.com>

It may also be worth pointing out that the external lock
is inherently broken - it's kept around for historical reasons
only.

See also this discussion
https://lkml.org/lkml/2012/5/28/72

Note: this is a bugfix, so 3.5 material I think.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
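
As an illustration of the failure described in the commit log, here is a
minimal userspace model of the lock mismatch. It is a sketch under
stated assumptions, not block layer code: the demo_* names are
hypothetical, a counter stands in for a spinlock, and
demo_cleanup_queue() mirrors only the pointer switch quoted from
blk_cleanup_queue().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counter standing in for a spinlock: 0 = free, 1 = held. */
struct demo_lock {
	int held;
};

struct demo_request_queue {
	struct demo_lock internal_lock;	/* like q->__queue_lock */
	struct demo_lock *queue_lock;	/* like q->queue_lock */
};

static void demo_lock_acquire(struct demo_lock *l)
{
	l->held++;
}

/* Returns false on an unbalanced unlock (the lock was not held). */
static bool demo_lock_release(struct demo_lock *l)
{
	if (l->held == 0)
		return false;
	l->held--;
	return true;
}

/* A NULL driver_lock selects the queue's own lock, as with blk_init_queue(). */
static void demo_init_queue(struct demo_request_queue *q,
			    struct demo_lock *driver_lock)
{
	q->internal_lock.held = 0;
	q->queue_lock = driver_lock ? driver_lock : &q->internal_lock;
}

/* Models the pointer switch done in the cleanup path. */
static void demo_cleanup_queue(struct demo_request_queue *q)
{
	if (q->queue_lock != &q->internal_lock)
		q->queue_lock = &q->internal_lock;
}

/*
 * A task takes the lock through the pointer, cleanup runs while the task
 * sleeps, and the task then releases through the (possibly changed) pointer.
 */
static bool demo_sleeper_unlock_is_balanced(struct demo_request_queue *q)
{
	demo_lock_acquire(q->queue_lock);
	demo_cleanup_queue(q);
	return demo_lock_release(q->queue_lock);
}
```

With a driver-provided lock the release hits a lock that was never
taken and the driver lock stays held, which matches the "bad unlock
balance" report. With a NULL driver lock the pointer never changes and
the unlock is balanced.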


> ---
>  drivers/block/virtio_blk.c |    9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index b4fa2d7..774c31d 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -21,8 +21,6 @@ struct workqueue_struct *virtblk_wq;
>  
>  struct virtio_blk
>  {
> -	spinlock_t lock;
> -
>  	struct virtio_device *vdev;
>  	struct virtqueue *vq;
>  
> @@ -65,7 +63,7 @@ static void blk_done(struct virtqueue *vq)
>  	unsigned int len;
>  	unsigned long flags;
>  
> -	spin_lock_irqsave(&vblk->lock, flags);
> +	spin_lock_irqsave(vblk->disk->queue->queue_lock, flags);
>  	while ((vbr = virtqueue_get_buf(vblk->vq, &len)) != NULL) {
>  		int error;
>  
> @@ -99,7 +97,7 @@ static void blk_done(struct virtqueue *vq)
>  	}
>  	/* In case queue is stopped waiting for more buffers. */
>  	blk_start_queue(vblk->disk->queue);
> -	spin_unlock_irqrestore(&vblk->lock, flags);
> +	spin_unlock_irqrestore(vblk->disk->queue->queue_lock, flags);
>  }
>  
>  static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
> @@ -431,7 +429,6 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
>  		goto out_free_index;
>  	}
>  
> -	spin_lock_init(&vblk->lock);
>  	vblk->vdev = vdev;
>  	vblk->sg_elems = sg_elems;
>  	sg_init_table(vblk->sg, vblk->sg_elems);
> @@ -456,7 +453,7 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
>  		goto out_mempool;
>  	}
>  
> -	q = vblk->disk->queue = blk_init_queue(do_virtblk_request, &vblk->lock);
> +	q = vblk->disk->queue = blk_init_queue(do_virtblk_request, NULL);
>  	if (!q) {
>  		err = -ENOMEM;
>  		goto out_put_disk;
> -- 
> 1.7.10.2

^ permalink raw reply

* [PATCH 03/27] smpboot: Define and use cpu_state per-cpu variable in generic code
From: Srivatsa S. Bhat @ 2012-06-01  9:10 UTC (permalink / raw)
  To: peterz, paulmck
  Cc: Venkatesh Pallipadi, Jeremy Fitzhardinge, linux-ia64, linux-mips,
	Benjamin Herrenschmidt, linux-kernel, H. Peter Anvin, mingo,
	linux-arch, xen-devel, Suresh Siddha, linux-sh, x86, Ingo Molnar,
	Fenghua Yu, Mike Frysinger, Peter Zijlstra, nikunj,
	Konrad Rzeszutek Wilk, Chris Metcalf, rjw, Yong Zhang,
	Thomas Gleixner, virtualization, Tony Luck, vatsa, Ralf Baechle
In-Reply-To: <20120601090952.31979.24799.stgit@srivatsabhat.in.ibm.com>

The per-cpu variable cpu_state is used in x86 and also used in other
architectures, to track the state of the cpu during bringup and hotplug.
Pull it out into generic code.

Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Yong Zhang <yong.zhang0@gmail.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: linux-ia64@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-sh@vger.kernel.org
Cc: xen-devel@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 arch/ia64/include/asm/cpu.h   |    2 --
 arch/ia64/kernel/process.c    |    1 +
 arch/ia64/kernel/smpboot.c    |    6 +-----
 arch/mips/cavium-octeon/smp.c |    4 +---
 arch/powerpc/kernel/smp.c     |    6 +-----
 arch/sh/include/asm/smp.h     |    2 --
 arch/sh/kernel/smp.c          |    4 +---
 arch/tile/kernel/smpboot.c    |    4 +---
 arch/x86/include/asm/cpu.h    |    2 --
 arch/x86/kernel/smpboot.c     |    4 +---
 arch/x86/xen/smp.c            |    1 +
 include/linux/smpboot.h       |    1 +
 kernel/smpboot.c              |    4 ++++
 13 files changed, 13 insertions(+), 28 deletions(-)

diff --git a/arch/ia64/include/asm/cpu.h b/arch/ia64/include/asm/cpu.h
index fcca30b..1c3acac 100644
--- a/arch/ia64/include/asm/cpu.h
+++ b/arch/ia64/include/asm/cpu.h
@@ -12,8 +12,6 @@ struct ia64_cpu {
 
 DECLARE_PER_CPU(struct ia64_cpu, cpu_devices);
 
-DECLARE_PER_CPU(int, cpu_state);
-
 #ifdef CONFIG_HOTPLUG_CPU
 extern int arch_register_cpu(int num);
 extern void arch_unregister_cpu(int);
diff --git a/arch/ia64/kernel/process.c b/arch/ia64/kernel/process.c
index 5e0e86d..32566c7 100644
--- a/arch/ia64/kernel/process.c
+++ b/arch/ia64/kernel/process.c
@@ -29,6 +29,7 @@
 #include <linux/kdebug.h>
 #include <linux/utsname.h>
 #include <linux/tracehook.h>
+#include <linux/smpboot.h>
 
 #include <asm/cpu.h>
 #include <asm/delay.h>
diff --git a/arch/ia64/kernel/smpboot.c b/arch/ia64/kernel/smpboot.c
index 963d2db..df00a3c 100644
--- a/arch/ia64/kernel/smpboot.c
+++ b/arch/ia64/kernel/smpboot.c
@@ -39,6 +39,7 @@
 #include <linux/efi.h>
 #include <linux/percpu.h>
 #include <linux/bitops.h>
+#include <linux/smpboot.h>
 
 #include <linux/atomic.h>
 #include <asm/cache.h>
@@ -111,11 +112,6 @@ extern unsigned long ia64_iobase;
 
 struct task_struct *task_for_booting_cpu;
 
-/*
- * State for each CPU
- */
-DEFINE_PER_CPU(int, cpu_state);
-
 cpumask_t cpu_core_map[NR_CPUS] __cacheline_aligned;
 EXPORT_SYMBOL(cpu_core_map);
 DEFINE_PER_CPU_SHARED_ALIGNED(cpumask_t, cpu_sibling_map);
diff --git a/arch/mips/cavium-octeon/smp.c b/arch/mips/cavium-octeon/smp.c
index 97e7ce9..93cd4b0 100644
--- a/arch/mips/cavium-octeon/smp.c
+++ b/arch/mips/cavium-octeon/smp.c
@@ -13,6 +13,7 @@
 #include <linux/kernel_stat.h>
 #include <linux/sched.h>
 #include <linux/module.h>
+#include <linux/smpboot.h>
 
 #include <asm/mmu_context.h>
 #include <asm/time.h>
@@ -252,9 +253,6 @@ static void octeon_cpus_done(void)
 
 #ifdef CONFIG_HOTPLUG_CPU
 
-/* State of each CPU. */
-DEFINE_PER_CPU(int, cpu_state);
-
 extern void fixup_irqs(void);
 
 static DEFINE_SPINLOCK(smp_reserve_lock);
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index e1417c4..1928058a 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -31,6 +31,7 @@
 #include <linux/cpu.h>
 #include <linux/notifier.h>
 #include <linux/topology.h>
+#include <linux/smpboot.h>
 
 #include <asm/ptrace.h>
 #include <linux/atomic.h>
@@ -57,11 +58,6 @@
 #define DBG(fmt...)
 #endif
 
-#ifdef CONFIG_HOTPLUG_CPU
-/* State of each CPU during hotplug phases */
-static DEFINE_PER_CPU(int, cpu_state) = { 0 };
-#endif
-
 struct thread_info *secondary_ti;
 
 DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
diff --git a/arch/sh/include/asm/smp.h b/arch/sh/include/asm/smp.h
index 78b0d0f4..bda041e 100644
--- a/arch/sh/include/asm/smp.h
+++ b/arch/sh/include/asm/smp.h
@@ -31,8 +31,6 @@ enum {
 	SMP_MSG_NR,	/* must be last */
 };
 
-DECLARE_PER_CPU(int, cpu_state);
-
 void smp_message_recv(unsigned int msg);
 void smp_timer_broadcast(const struct cpumask *mask);
 
diff --git a/arch/sh/kernel/smp.c b/arch/sh/kernel/smp.c
index b86e9ca..8e0fde0 100644
--- a/arch/sh/kernel/smp.c
+++ b/arch/sh/kernel/smp.c
@@ -22,6 +22,7 @@
 #include <linux/interrupt.h>
 #include <linux/sched.h>
 #include <linux/atomic.h>
+#include <linux/smpboot.h>
 #include <asm/processor.h>
 #include <asm/mmu_context.h>
 #include <asm/smp.h>
@@ -34,9 +35,6 @@ int __cpu_logical_map[NR_CPUS];		/* Map logical to physical */
 
 struct plat_smp_ops *mp_ops = NULL;
 
-/* State of each CPU */
-DEFINE_PER_CPU(int, cpu_state) = { 0 };
-
 void __cpuinit register_smp_ops(struct plat_smp_ops *ops)
 {
 	if (mp_ops)
diff --git a/arch/tile/kernel/smpboot.c b/arch/tile/kernel/smpboot.c
index e686c5a..24a9c06 100644
--- a/arch/tile/kernel/smpboot.c
+++ b/arch/tile/kernel/smpboot.c
@@ -25,13 +25,11 @@
 #include <linux/delay.h>
 #include <linux/err.h>
 #include <linux/irq.h>
+#include <linux/smpboot.h>
 #include <asm/mmu_context.h>
 #include <asm/tlbflush.h>
 #include <asm/sections.h>
 
-/* State of each CPU. */
-static DEFINE_PER_CPU(int, cpu_state) = { 0 };
-
 /* The messaging code jumps to this pointer during boot-up */
 unsigned long start_cpu_function_addr;
 
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index 4564c8e..2d0b239 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -30,8 +30,6 @@ extern int arch_register_cpu(int num);
 extern void arch_unregister_cpu(int);
 #endif
 
-DECLARE_PER_CPU(int, cpu_state);
-
 int mwait_usable(const struct cpuinfo_x86 *);
 
 #endif /* _ASM_X86_CPU_H */
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index bfbe30e..269bc1f 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -51,6 +51,7 @@
 #include <linux/stackprotector.h>
 #include <linux/gfp.h>
 #include <linux/cpuidle.h>
+#include <linux/smpboot.h>
 
 #include <asm/acpi.h>
 #include <asm/desc.h>
@@ -73,9 +74,6 @@
 #include <asm/smpboot_hooks.h>
 #include <asm/i8259.h>
 
-/* State of each CPU */
-DEFINE_PER_CPU(int, cpu_state) = { 0 };
-
 #ifdef CONFIG_HOTPLUG_CPU
 /*
  * We need this for trampoline_base protection from concurrent accesses when
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 2ef5948..09a7199 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -16,6 +16,7 @@
 #include <linux/err.h>
 #include <linux/slab.h>
 #include <linux/smp.h>
+#include <linux/smpboot.h>
 
 #include <asm/paravirt.h>
 #include <asm/desc.h>
diff --git a/include/linux/smpboot.h b/include/linux/smpboot.h
index 63bbedd..834d90c 100644
--- a/include/linux/smpboot.h
+++ b/include/linux/smpboot.h
@@ -5,6 +5,7 @@
 #ifndef SMPBOOT_H
 #define SMPBOOT_H
 
+DECLARE_PER_CPU(int, cpu_state);
 extern void smpboot_start_secondary(void *arg);
 
 #endif
diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index 5ae1805..0df43b0 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -67,6 +67,8 @@ void __init idle_threads_init(void)
 }
 #endif
 
+/* State of each CPU during bringup/teardown */
+DEFINE_PER_CPU(int, cpu_state) = { 0 };
 
 /* Implement the following functions in your architecture, as appropriate. */
 
@@ -141,6 +143,8 @@ void __cpuinit smpboot_start_secondary(void *arg)
 	set_cpu_online(cpu, true);
 	arch_vector_unlock();
 
+	per_cpu(cpu_state, cpu) = CPU_ONLINE;
+
 	__cpu_post_online(arg);
 
 	/* Enable local interrupts now */

^ permalink raw reply related

* [PATCH 05/27] xen, cpu hotplug: Don't call cpu_bringup() in xen_play_dead()
From: Srivatsa S. Bhat @ 2012-06-01  9:11 UTC (permalink / raw)
  To: peterz, paulmck
  Cc: linux-arch, Jeremy Fitzhardinge, x86, nikunj,
	Konrad Rzeszutek Wilk, vatsa, linux-kernel, rjw, yong.zhang0,
	Ingo Molnar, Thomas Gleixner, srivatsa.bhat, H. Peter Anvin,
	xen-devel, akpm, virtualization, mingo
In-Reply-To: <20120601090952.31979.24799.stgit@srivatsabhat.in.ibm.com>

xen_play_dead calls cpu_bringup() which looks weird, because xen_play_dead()
is invoked in the cpu down path, whereas cpu_bringup() (as the name suggests)
is useful in the cpu bringup path.

Getting rid of xen_play_dead()'s dependency on cpu_bringup() helps in hooking
on to the generic SMP booting framework.

Also remove the extra call to preempt_enable() added by commit 41bd956
(xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling while
atomic) because it becomes unnecessary after this change.

Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: xen-devel@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 arch/x86/xen/smp.c |    8 --------
 1 files changed, 0 insertions(+), 8 deletions(-)

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 09a7199..602d6b7 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -417,14 +417,6 @@ static void __cpuinit xen_play_dead(void) /* used only with HOTPLUG_CPU */
 {
 	play_dead_common();
 	HYPERVISOR_vcpu_op(VCPUOP_down, smp_processor_id(), NULL);
-	cpu_bringup();
-	/*
-	 * Balance out the preempt calls - as we are running in cpu_idle
-	 * loop which has been called at bootup from cpu_bringup_and_idle.
-	 * The cpucpu_bringup_and_idle called cpu_bringup which made a
-	 * preempt_disable() So this preempt_enable will balance it out.
-	 */
-	preempt_enable();
 }
 
 #else /* !CONFIG_HOTPLUG_CPU */

^ permalink raw reply related

* [PATCH 06/27] xen, smpboot: Use generic SMP booting infrastructure
From: Srivatsa S. Bhat @ 2012-06-01  9:11 UTC (permalink / raw)
  To: peterz, paulmck
  Cc: linux-arch, Jeremy Fitzhardinge, x86, nikunj,
	Konrad Rzeszutek Wilk, vatsa, linux-kernel, rjw, yong.zhang0,
	Ingo Molnar, Thomas Gleixner, srivatsa.bhat, H. Peter Anvin,
	xen-devel, akpm, virtualization, mingo
In-Reply-To: <20120601090952.31979.24799.stgit@srivatsabhat.in.ibm.com>

Convert xen to use the generic framework to boot secondary CPUs.

Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: xen-devel@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 arch/x86/xen/smp.c |   21 ++++-----------------
 1 files changed, 4 insertions(+), 17 deletions(-)

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 602d6b7..46c96f9 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -58,13 +58,12 @@ static irqreturn_t xen_reschedule_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
-static void __cpuinit cpu_bringup(void)
+void __cpuinit xen_cpu_pre_starting(void *unused)
 {
 	int cpu;
 
 	cpu_init();
 	touch_softlockup_watchdog();
-	preempt_disable();
 
 	xen_enable_sysenter();
 	xen_enable_syscall();
@@ -75,25 +74,11 @@ static void __cpuinit cpu_bringup(void)
 	set_cpu_sibling_map(cpu);
 
 	xen_setup_cpu_clockevents();
-
-	notify_cpu_starting(cpu);
-
-	set_cpu_online(cpu, true);
-
-	this_cpu_write(cpu_state, CPU_ONLINE);
-
-	wmb();
-
-	/* We can take interrupts now: we're officially "up". */
-	local_irq_enable();
-
-	wmb();			/* make sure everything is out */
 }
 
 static void __cpuinit cpu_bringup_and_idle(void)
 {
-	cpu_bringup();
-	cpu_idle();
+	smpboot_start_secondary(NULL);
 }
 
 static int xen_smp_intr_init(unsigned int cpu)
@@ -515,6 +500,8 @@ static const struct smp_ops xen_smp_ops __initconst = {
 	.smp_prepare_cpus = xen_smp_prepare_cpus,
 	.smp_cpus_done = xen_smp_cpus_done,
 
+	.cpu_pre_starting = xen_cpu_pre_starting,
+
 	.cpu_up = xen_cpu_up,
 	.cpu_die = xen_cpu_die,
 	.cpu_disable = xen_cpu_disable,

^ permalink raw reply related

