* [PATCH v3] virtio_blk: unlock vblk->lock during kick
@ 2012-06-01 9:13 Stefan Hajnoczi
2012-06-04 8:33 ` Asias He
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2012-06-01 9:13 UTC (permalink / raw)
To: virtualization; +Cc: Stefan Hajnoczi, kvm, Michael S. Tsirkin, khoa
Holding the vblk->lock across kick causes poor scalability in SMP
guests. If one CPU is doing virtqueue kick and another CPU touches the
vblk->lock it will have to spin until virtqueue kick completes.
This patch reduces system% CPU utilization in SMP guests that are
running multithreaded I/O-bound workloads. The improvements are small
but become more visible as iops and the number of vCPUs increase.
Khoa Huynh <khoa@us.ibm.com> provided initial performance data that
indicates this optimization is worthwhile at high iops.
Asias He <asias@redhat.com> reports the following fio results:
Host: Linux 3.4.0+ #302 SMP x86_64 GNU/Linux
Guest: same as host kernel
Average 3 runs:
with locked kick
read iops=119907.50 bw=59954.00 runt=35018.50 io=2048.00
write iops=217187.00 bw=108594.00 runt=19312.00 io=2048.00
read iops=33948.00 bw=16974.50 runt=186820.50 io=3095.70
write iops=35014.00 bw=17507.50 runt=181151.00 io=3095.70
clat (usec) max=3484.10 avg=121085.38 stdev=174416.11 min=0.00
clat (usec) max=3438.30 avg=59863.35 stdev=116607.69 min=0.00
clat (usec) max=3745.65 avg=454501.30 stdev=332699.00 min=0.00
clat (usec) max=4089.75 avg=442374.99 stdev=304874.62 min=0.00
cpu sys=615.12 majf=24080.50 ctx=64253616.50 usr=68.08 minf=17907363.00
cpu sys=1235.95 majf=23389.00 ctx=59788148.00 usr=98.34 minf=20020008.50
cpu sys=764.96 majf=28414.00 ctx=848279274.00 usr=36.39 minf=19737254.00
cpu sys=714.13 majf=21853.50 ctx=854608972.00 usr=33.56 minf=18256760.50
with unlocked kick
read iops=118559.00 bw=59279.66 runt=35400.66 io=2048.00
write iops=227560.00 bw=113780.33 runt=18440.00 io=2048.00
read iops=34567.66 bw=17284.00 runt=183497.33 io=3095.70
write iops=34589.33 bw=17295.00 runt=183355.00 io=3095.70
clat (usec) max=3485.56 avg=121989.58 stdev=197355.15 min=0.00
clat (usec) max=3222.33 avg=57784.11 stdev=141002.89 min=0.00
clat (usec) max=4060.93 avg=447098.65 stdev=315734.33 min=0.00
clat (usec) max=3656.30 avg=447281.70 stdev=314051.33 min=0.00
cpu sys=683.78 majf=24501.33 ctx=64435364.66 usr=68.91 minf=17907893.33
cpu sys=1218.24 majf=25000.33 ctx=60451475.00 usr=101.04 minf=19757720.00
cpu sys=740.39 majf=24809.00 ctx=845290443.66 usr=37.25 minf=19349958.33
cpu sys=723.63 majf=27597.33 ctx=850199927.33 usr=35.35 minf=19092343.00
FIO config file:
[global]
exec_prerun="echo 3 > /proc/sys/vm/drop_caches"
group_reporting
norandommap
ioscheduler=noop
thread
bs=512
size=4MB
direct=1
filename=/dev/vdb
numjobs=256
ioengine=aio
iodepth=64
loops=3
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
Other block drivers (cciss, rbd, nbd) use spin_unlock_irq() so I followed that.
To me this seems wrong: blk_run_queue() uses spin_lock_irqsave() but we enable
irqs with spin_unlock_irq(). If the caller of blk_run_queue() had irqs
disabled and we enable them again this could be a problem, right? Can someone
more familiar with kernel locking comment?
drivers/block/virtio_blk.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 774c31d..d674977 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -199,8 +199,14 @@ static void do_virtblk_request(struct request_queue *q)
issued++;
}
- if (issued)
- virtqueue_kick(vblk->vq);
+ if (!issued)
+ return;
+
+ if (virtqueue_kick_prepare(vblk->vq)) {
+ spin_unlock_irq(vblk->disk->queue->queue_lock);
+ virtqueue_notify(vblk->vq);
+ spin_lock_irq(vblk->disk->queue->queue_lock);
+ }
}
/* return id (s/n) string for *disk to *id_str
--
1.7.10
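A note on the split API used in the hunk above: in the 3.4-era virtio core,
virtqueue_kick() is essentially these two calls combined, which is what makes
the lock dropping possible. Roughly (a paraphrase of
drivers/virtio/virtio_ring.c, not verbatim):
        /*
         * virtqueue_kick_prepare() updates the avail ring and checks event
         * suppression, so it must run under the caller's lock (vblk->lock
         * here).  virtqueue_notify() is only the doorbell write (a VM exit)
         * and is safe to call with the lock dropped.
         */
        void virtqueue_kick(struct virtqueue *vq)
        {
                if (virtqueue_kick_prepare(vq))
                        virtqueue_notify(vq);
        }
Only the prepare step touches state guarded by vblk->lock, which is why the
patch can unlock around virtqueue_notify() alone.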
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v3] virtio_blk: unlock vblk->lock during kick
2012-06-01 9:13 [PATCH v3] virtio_blk: unlock vblk->lock during kick Stefan Hajnoczi
@ 2012-06-04 8:33 ` Asias He
2012-06-04 11:11 ` Michael S. Tsirkin
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Asias He @ 2012-06-04 8:33 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: khoa, Michael S. Tsirkin, kvm, virtualization
On 06/01/2012 05:13 PM, Stefan Hajnoczi wrote:
> Holding the vblk->lock across kick causes poor scalability in SMP
> guests. If one CPU is doing virtqueue kick and another CPU touches the
> vblk->lock it will have to spin until virtqueue kick completes.
>
> This patch reduces system% CPU utilization in SMP guests that are
> running multithreaded I/O-bound workloads. The improvements are small
> but become more visible as iops and the number of vCPUs increase.
>
> Khoa Huynh<khoa@us.ibm.com> provided initial performance data that
> indicates this optimization is worthwhile at high iops.
>
> Asias He<asias@redhat.com> reports the following fio results:
>
> Host: Linux 3.4.0+ #302 SMP x86_64 GNU/Linux
> Guest: same as host kernel
>
> Average 3 runs:
> with locked kick
> read iops=119907.50 bw=59954.00 runt=35018.50 io=2048.00
> write iops=217187.00 bw=108594.00 runt=19312.00 io=2048.00
> read iops=33948.00 bw=16974.50 runt=186820.50 io=3095.70
> write iops=35014.00 bw=17507.50 runt=181151.00 io=3095.70
> clat (usec) max=3484.10 avg=121085.38 stdev=174416.11 min=0.00
> clat (usec) max=3438.30 avg=59863.35 stdev=116607.69 min=0.00
> clat (usec) max=3745.65 avg=454501.30 stdev=332699.00 min=0.00
> clat (usec) max=4089.75 avg=442374.99 stdev=304874.62 min=0.00
> cpu sys=615.12 majf=24080.50 ctx=64253616.50 usr=68.08 minf=17907363.00
> cpu sys=1235.95 majf=23389.00 ctx=59788148.00 usr=98.34 minf=20020008.50
> cpu sys=764.96 majf=28414.00 ctx=848279274.00 usr=36.39 minf=19737254.00
> cpu sys=714.13 majf=21853.50 ctx=854608972.00 usr=33.56 minf=18256760.50
>
> with unlocked kick
> read iops=118559.00 bw=59279.66 runt=35400.66 io=2048.00
> write iops=227560.00 bw=113780.33 runt=18440.00 io=2048.00
> read iops=34567.66 bw=17284.00 runt=183497.33 io=3095.70
> write iops=34589.33 bw=17295.00 runt=183355.00 io=3095.70
> clat (usec) max=3485.56 avg=121989.58 stdev=197355.15 min=0.00
> clat (usec) max=3222.33 avg=57784.11 stdev=141002.89 min=0.00
> clat (usec) max=4060.93 avg=447098.65 stdev=315734.33 min=0.00
> clat (usec) max=3656.30 avg=447281.70 stdev=314051.33 min=0.00
> cpu sys=683.78 majf=24501.33 ctx=64435364.66 usr=68.91 minf=17907893.33
> cpu sys=1218.24 majf=25000.33 ctx=60451475.00 usr=101.04 minf=19757720.00
> cpu sys=740.39 majf=24809.00 ctx=845290443.66 usr=37.25 minf=19349958.33
> cpu sys=723.63 majf=27597.33 ctx=850199927.33 usr=35.35 minf=19092343.00
>
> FIO config file:
>
> [global]
> exec_prerun="echo 3 > /proc/sys/vm/drop_caches"
> group_reporting
> norandommap
> ioscheduler=noop
> thread
> bs=512
> size=4MB
> direct=1
> filename=/dev/vdb
> numjobs=256
> ioengine=aio
> iodepth=64
> loops=3
>
> Signed-off-by: Stefan Hajnoczi<stefanha@linux.vnet.ibm.com>
> ---
> Other block drivers (cciss, rbd, nbd) use spin_unlock_irq() so I followed that.
> To me this seems wrong: blk_run_queue() uses spin_lock_irqsave() but we enable
> irqs with spin_unlock_irq(). If the caller of blk_run_queue() had irqs
> disabled and we enable them again this could be a problem, right? Can someone
> more familiar with kernel locking comment?
blk_run_queue() is not used in our code path; we use __blk_run_queue().
The code path is:
generic_make_request() -> q->make_request_fn() -> blk_queue_bio()
-> __blk_run_queue() -> q->request_fn() -> do_virtblk_request().
__blk_run_queue() is called with interrupts disabled and the queue lock
held. In blk_queue_bio(), __blk_run_queue() is protected by
spin_lock_irq(q->queue_lock).
The locking in the block layer seems a bit confusing, e.g. block/blk-core.c
mixes the spin_lock_irq() and spin_lock_irqsave() variants.
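A condensed sketch of that path, paraphrasing the 3.4-era block/blk-core.c
(bio checks, merging, request allocation and plug handling elided; not
verbatim):
        void blk_queue_bio(struct request_queue *q, struct bio *bio)
        {
                /* ... merging and request allocation elided ... */
                spin_lock_irq(q->queue_lock);   /* interrupts off, lock taken */
                /* ... add the new request to the queue ... */
                __blk_run_queue(q);             /* -> q->request_fn(), i.e.
                                                 * do_virtblk_request(), called
                                                 * with q->queue_lock held and
                                                 * interrupts disabled */
                spin_unlock_irq(q->queue_lock);
        }
So by the time do_virtblk_request() runs, interrupts are already off and
q->queue_lock is held, which is the pairing the patch unwinds around the
notify.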
>
> drivers/block/virtio_blk.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 774c31d..d674977 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -199,8 +199,14 @@ static void do_virtblk_request(struct request_queue *q)
> issued++;
> }
>
> - if (issued)
> - virtqueue_kick(vblk->vq);
> + if (!issued)
> + return;
> +
> + if (virtqueue_kick_prepare(vblk->vq)) {
> + spin_unlock_irq(vblk->disk->queue->queue_lock);
> + virtqueue_notify(vblk->vq);
> + spin_lock_irq(vblk->disk->queue->queue_lock);
> + }
> }
>
> /* return id (s/n) string for *disk to *id_str
--
Asias
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v3] virtio_blk: unlock vblk->lock during kick
2012-06-01 9:13 [PATCH v3] virtio_blk: unlock vblk->lock during kick Stefan Hajnoczi
2012-06-04 8:33 ` Asias He
@ 2012-06-04 11:11 ` Michael S. Tsirkin
2012-06-06 15:25 ` Stefan Hajnoczi
[not found] ` <CAJSP0QWkXeCQKOEMoV6XkNpwbnQczq9Smx85=Tg-73A9fmSyVQ@mail.gmail.com>
2012-06-04 11:15 ` Michael S. Tsirkin
2012-06-04 21:13 ` Khoa Huynh
3 siblings, 2 replies; 8+ messages in thread
From: Michael S. Tsirkin @ 2012-06-04 11:11 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: khoa, kvm, virtualization
On Fri, Jun 01, 2012 at 10:13:06AM +0100, Stefan Hajnoczi wrote:
> Holding the vblk->lock across kick causes poor scalability in SMP
> guests. If one CPU is doing virtqueue kick and another CPU touches the
> vblk->lock it will have to spin until virtqueue kick completes.
>
> This patch reduces system% CPU utilization in SMP guests that are
> running multithreaded I/O-bound workloads. The improvements are small
> but become more visible as iops and the number of vCPUs increase.
>
> Khoa Huynh <khoa@us.ibm.com> provided initial performance data that
> indicates this optimization is worthwhile at high iops.
>
> Asias He <asias@redhat.com> reports the following fio results:
>
> Host: Linux 3.4.0+ #302 SMP x86_64 GNU/Linux
> Guest: same as host kernel
>
> Average 3 runs:
> with locked kick
> read iops=119907.50 bw=59954.00 runt=35018.50 io=2048.00
> write iops=217187.00 bw=108594.00 runt=19312.00 io=2048.00
> read iops=33948.00 bw=16974.50 runt=186820.50 io=3095.70
> write iops=35014.00 bw=17507.50 runt=181151.00 io=3095.70
> clat (usec) max=3484.10 avg=121085.38 stdev=174416.11 min=0.00
> clat (usec) max=3438.30 avg=59863.35 stdev=116607.69 min=0.00
> clat (usec) max=3745.65 avg=454501.30 stdev=332699.00 min=0.00
> clat (usec) max=4089.75 avg=442374.99 stdev=304874.62 min=0.00
> cpu sys=615.12 majf=24080.50 ctx=64253616.50 usr=68.08 minf=17907363.00
> cpu sys=1235.95 majf=23389.00 ctx=59788148.00 usr=98.34 minf=20020008.50
> cpu sys=764.96 majf=28414.00 ctx=848279274.00 usr=36.39 minf=19737254.00
> cpu sys=714.13 majf=21853.50 ctx=854608972.00 usr=33.56 minf=18256760.50
>
> with unlocked kick
> read iops=118559.00 bw=59279.66 runt=35400.66 io=2048.00
> write iops=227560.00 bw=113780.33 runt=18440.00 io=2048.00
> read iops=34567.66 bw=17284.00 runt=183497.33 io=3095.70
> write iops=34589.33 bw=17295.00 runt=183355.00 io=3095.70
> clat (usec) max=3485.56 avg=121989.58 stdev=197355.15 min=0.00
> clat (usec) max=3222.33 avg=57784.11 stdev=141002.89 min=0.00
> clat (usec) max=4060.93 avg=447098.65 stdev=315734.33 min=0.00
> clat (usec) max=3656.30 avg=447281.70 stdev=314051.33 min=0.00
> cpu sys=683.78 majf=24501.33 ctx=64435364.66 usr=68.91 minf=17907893.33
> cpu sys=1218.24 majf=25000.33 ctx=60451475.00 usr=101.04 minf=19757720.00
> cpu sys=740.39 majf=24809.00 ctx=845290443.66 usr=37.25 minf=19349958.33
> cpu sys=723.63 majf=27597.33 ctx=850199927.33 usr=35.35 minf=19092343.00
>
> FIO config file:
>
> [global]
> exec_prerun="echo 3 > /proc/sys/vm/drop_caches"
> group_reporting
> norandommap
> ioscheduler=noop
> thread
> bs=512
> size=4MB
> direct=1
> filename=/dev/vdb
> numjobs=256
> ioengine=aio
> iodepth=64
> loops=3
>
> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> ---
> Other block drivers (cciss, rbd, nbd) use spin_unlock_irq() so I followed that.
> To me this seems wrong: blk_run_queue() uses spin_lock_irqsave() but we enable
> irqs with spin_unlock_irq(). If the caller of blk_run_queue() had irqs
> disabled and we enable them again this could be a problem, right? Can someone
> more familiar with kernel locking comment?
>
> drivers/block/virtio_blk.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 774c31d..d674977 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -199,8 +199,14 @@ static void do_virtblk_request(struct request_queue *q)
> issued++;
> }
>
> - if (issued)
> - virtqueue_kick(vblk->vq);
> + if (!issued)
> + return;
> +
> + if (virtqueue_kick_prepare(vblk->vq)) {
> + spin_unlock_irq(vblk->disk->queue->queue_lock);
> + virtqueue_notify(vblk->vq);
If blk_done runs and completes the request at this point,
can hot unplug then remove the queue?
If yes will we get a use after free?
> + spin_lock_irq(vblk->disk->queue->queue_lock);
> + }
> }
>
> /* return id (s/n) string for *disk to *id_str
> --
> 1.7.10
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v3] virtio_blk: unlock vblk->lock during kick
2012-06-01 9:13 [PATCH v3] virtio_blk: unlock vblk->lock during kick Stefan Hajnoczi
2012-06-04 8:33 ` Asias He
2012-06-04 11:11 ` Michael S. Tsirkin
@ 2012-06-04 11:15 ` Michael S. Tsirkin
2012-06-06 9:03 ` Stefan Hajnoczi
2012-06-04 21:13 ` Khoa Huynh
3 siblings, 1 reply; 8+ messages in thread
From: Michael S. Tsirkin @ 2012-06-04 11:15 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: khoa, kvm, virtualization
On Fri, Jun 01, 2012 at 10:13:06AM +0100, Stefan Hajnoczi wrote:
> Other block drivers (cciss, rbd, nbd) use spin_unlock_irq() so I followed that.
> To me this seems wrong: blk_run_queue() uses spin_lock_irqsave() but we enable
> irqs with spin_unlock_irq(). If the caller of blk_run_queue() had irqs
> disabled and we enable them again this could be a problem, right? Can someone
> more familiar with kernel locking comment?
Why take the risk? What's the advantage of enabling them here? The VCPU is
not running while the hypervisor is processing the notification anyway.
And the next line returns from the function, so interrupts will get enabled
again right after.
> drivers/block/virtio_blk.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 774c31d..d674977 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -199,8 +199,14 @@ static void do_virtblk_request(struct request_queue *q)
> issued++;
> }
>
> - if (issued)
> - virtqueue_kick(vblk->vq);
> + if (!issued)
> + return;
> +
> + if (virtqueue_kick_prepare(vblk->vq)) {
> + spin_unlock_irq(vblk->disk->queue->queue_lock);
> + virtqueue_notify(vblk->vq);
> + spin_lock_irq(vblk->disk->queue->queue_lock);
> + }
> }
>
> /* return id (s/n) string for *disk to *id_str
> --
> 1.7.10
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v3] virtio_blk: unlock vblk->lock during kick
2012-06-01 9:13 [PATCH v3] virtio_blk: unlock vblk->lock during kick Stefan Hajnoczi
` (2 preceding siblings ...)
2012-06-04 11:15 ` Michael S. Tsirkin
@ 2012-06-04 21:13 ` Khoa Huynh
3 siblings, 0 replies; 8+ messages in thread
From: Khoa Huynh @ 2012-06-04 21:13 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: virtualization, kvm, Michael S. Tsirkin
Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> wrote on 06/01/2012 04:13:06 AM:
>
> Holding the vblk->lock across kick causes poor scalability in SMP
> guests. If one CPU is doing virtqueue kick and another CPU touches the
> vblk->lock it will have to spin until virtqueue kick completes.
>
> This patch reduces system% CPU utilization in SMP guests that are
> running multithreaded I/O-bound workloads. The improvements are small
> but become more visible as iops and the number of vCPUs increase.
>
> Khoa Huynh <khoa@us.ibm.com> provided initial performance data that
> indicates this optimization is worthwhile at high iops.
Hi Stefan, I was out on vacation last week, so I could not post the details
of my performance testing with this unlocked-kick patch.
Anyway, here are more details:
Test environment:
- Host: IBM x3850 X5 server (40 cores, 256GB) running RHEL6.2
- Guest: 20 vcpus, 16GB, running Linux 3.4 kernel (w/ and w/o this patch)
- Storage: 12 block devices (from 3 SCSI target servers)
- Workload: FIO benchmark with 8KB random reads and writes (50% split)
With RHEL6.2 qemu-kvm:
- Locked kick = 44,529 IOPS
- Unlocked kick = 44,869 IOPS ==> 0.7% gain
With Stefan's "data-plane" qemu-kvm (experimental qemu to get around
qemu mutex):
- Locked kick = 497,943 IOPS
- Unlocked kick = 532,446 IOPS ==> 6.9% gain
I also tested on a smaller host and smaller guest (4 vcpus, 4GB, 4 virtual
block devices) and applied some performance tuning (e.g. smaller data sets
so most data fits into the host's cache, 1KB reads) to deliver the highest I/O
rates possible to the guest with the existing RHEL6.2 qemu-kvm:
- Locked kick = 140,533 IOPS
- Unlocked kick = 143,415 IOPS ==> 2.1% gain
As you can see, the higher the I/O rate, the more performance benefit
we would get from Stefan's unlocked-kick patch. In my performance testing,
the highest I/O rate that I could achieve with the existing KVM/QEMU was
~140,000+ IOPS, so the most performance gain that we could get from this
unlocked-kick patch in this case was about 2-3%. However, going forward,
as we relieve the qemu mutex more and more, we should be able to get much
higher I/O rates (500,000 IOPS or higher), and so I believe that the
performance benefit of this unlocked-kick patch would be more apparent.
If you need further information, please let me know.
Thanks,
-Khoa
>
> Asias He <asias@redhat.com> reports the following fio results:
>
> Host: Linux 3.4.0+ #302 SMP x86_64 GNU/Linux
> Guest: same as host kernel
>
> Average 3 runs:
> with locked kick
> read iops=119907.50 bw=59954.00 runt=35018.50 io=2048.00
> write iops=217187.00 bw=108594.00 runt=19312.00 io=2048.00
> read iops=33948.00 bw=16974.50 runt=186820.50 io=3095.70
> write iops=35014.00 bw=17507.50 runt=181151.00 io=3095.70
> clat (usec) max=3484.10 avg=121085.38 stdev=174416.11 min=0.00
> clat (usec) max=3438.30 avg=59863.35 stdev=116607.69 min=0.00
> clat (usec) max=3745.65 avg=454501.30 stdev=332699.00 min=0.00
> clat (usec) max=4089.75 avg=442374.99 stdev=304874.62 min=0.00
> cpu sys=615.12 majf=24080.50 ctx=64253616.50 usr=68.08 minf=17907363.00
> cpu sys=1235.95 majf=23389.00 ctx=59788148.00 usr=98.34 minf=20020008.50
> cpu sys=764.96 majf=28414.00 ctx=848279274.00 usr=36.39 minf=19737254.00
> cpu sys=714.13 majf=21853.50 ctx=854608972.00 usr=33.56 minf=18256760.50
>
> with unlocked kick
> read iops=118559.00 bw=59279.66 runt=35400.66 io=2048.00
> write iops=227560.00 bw=113780.33 runt=18440.00 io=2048.00
> read iops=34567.66 bw=17284.00 runt=183497.33 io=3095.70
> write iops=34589.33 bw=17295.00 runt=183355.00 io=3095.70
> clat (usec) max=3485.56 avg=121989.58 stdev=197355.15 min=0.00
> clat (usec) max=3222.33 avg=57784.11 stdev=141002.89 min=0.00
> clat (usec) max=4060.93 avg=447098.65 stdev=315734.33 min=0.00
> clat (usec) max=3656.30 avg=447281.70 stdev=314051.33 min=0.00
> cpu sys=683.78 majf=24501.33 ctx=64435364.66 usr=68.91 minf=17907893.33
> cpu sys=1218.24 majf=25000.33 ctx=60451475.00 usr=101.04 minf=19757720.00
> cpu sys=740.39 majf=24809.00 ctx=845290443.66 usr=37.25 minf=19349958.33
> cpu sys=723.63 majf=27597.33 ctx=850199927.33 usr=35.35 minf=19092343.00
>
> FIO config file:
>
> [global]
> exec_prerun="echo 3 > /proc/sys/vm/drop_caches"
> group_reporting
> norandommap
> ioscheduler=noop
> thread
> bs=512
> size=4MB
> direct=1
> filename=/dev/vdb
> numjobs=256
> ioengine=aio
> iodepth=64
> loops=3
>
> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
> ---
> Other block drivers (cciss, rbd, nbd) use spin_unlock_irq() so I followed that.
> To me this seems wrong: blk_run_queue() uses spin_lock_irqsave() but we enable
> irqs with spin_unlock_irq(). If the caller of blk_run_queue() had irqs
> disabled and we enable them again this could be a problem, right? Can someone
> more familiar with kernel locking comment?
>
> drivers/block/virtio_blk.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 774c31d..d674977 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -199,8 +199,14 @@ static void do_virtblk_request(struct request_queue *q)
> issued++;
> }
>
> - if (issued)
> - virtqueue_kick(vblk->vq);
> + if (!issued)
> + return;
> +
> + if (virtqueue_kick_prepare(vblk->vq)) {
> + spin_unlock_irq(vblk->disk->queue->queue_lock);
> + virtqueue_notify(vblk->vq);
> + spin_lock_irq(vblk->disk->queue->queue_lock);
> + }
> }
>
> /* return id (s/n) string for *disk to *id_str
> --
> 1.7.10
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v3] virtio_blk: unlock vblk->lock during kick
2012-06-04 11:15 ` Michael S. Tsirkin
@ 2012-06-06 9:03 ` Stefan Hajnoczi
0 siblings, 0 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2012-06-06 9:03 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: khoa, Stefan Hajnoczi, kvm, virtualization
On Mon, Jun 4, 2012 at 12:15 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Fri, Jun 01, 2012 at 10:13:06AM +0100, Stefan Hajnoczi wrote:
>> Other block drivers (cciss, rbd, nbd) use spin_unlock_irq() so I followed that.
>> To me this seems wrong: blk_run_queue() uses spin_lock_irqsave() but we enable
>> irqs with spin_unlock_irq(). If the caller of blk_run_queue() had irqs
>> disabled and we enable them again this could be a problem, right? Can someone
>> more familiar with kernel locking comment?
>
> Why take the risk? What's the advantage of enabling them here? VCPU is
> not running while the hypervisor is processing the notification anyway.
> And the next line returns from the function so the interrupts will get
> enabled.
I agree. After looking through the code more, following Asias' call
chain, I'm happy to use spin_unlock() and not worry about the irq
part.
Will fix.
Stefan
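(For illustration only -- a sketch of what that fix presumably looks like,
i.e. dropping just the lock and leaving interrupts disabled across the
notify; this is not the actual follow-up patch:)
        if (virtqueue_kick_prepare(vblk->vq)) {
                spin_unlock(vblk->disk->queue->queue_lock);
                virtqueue_notify(vblk->vq);
                spin_lock(vblk->disk->queue->queue_lock);
        }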
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v3] virtio_blk: unlock vblk->lock during kick
2012-06-04 11:11 ` Michael S. Tsirkin
@ 2012-06-06 15:25 ` Stefan Hajnoczi
[not found] ` <CAJSP0QWkXeCQKOEMoV6XkNpwbnQczq9Smx85=Tg-73A9fmSyVQ@mail.gmail.com>
1 sibling, 0 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2012-06-06 15:25 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: khoa, Stefan Hajnoczi, kvm, virtualization
On Mon, Jun 4, 2012 at 12:11 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Fri, Jun 01, 2012 at 10:13:06AM +0100, Stefan Hajnoczi wrote:
>> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
>> index 774c31d..d674977 100644
>> --- a/drivers/block/virtio_blk.c
>> +++ b/drivers/block/virtio_blk.c
>> @@ -199,8 +199,14 @@ static void do_virtblk_request(struct request_queue *q)
>> issued++;
>> }
>>
>> - if (issued)
>> - virtqueue_kick(vblk->vq);
>> + if (!issued)
>> + return;
>> +
>> + if (virtqueue_kick_prepare(vblk->vq)) {
>> + spin_unlock_irq(vblk->disk->queue->queue_lock);
>> + virtqueue_notify(vblk->vq);
>
> If blk_done runs and completes the request at this point,
> can hot unplug then remove the queue?
> If yes will we get a use after free?
This is a difficult question; I haven't been able to decide one way or
another. The use-after-free is the
spin_lock_irq(vblk->disk->queue->queue_lock).
It still doesn't explain why existing drivers are doing this. In nbd,
for example, I can't see anything preventing the same situation in
drivers/block/nbd.c:do_nbd_request() between wake_up() and
spin_lock_irq(q->queue_lock). If the request completes (like in your
example scenario) then the module remove code path has no way of
knowing there is still a thread in do_nbd_request().
Any ideas?
Stefan
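(To make the window being discussed concrete -- an annotated view of the
unlocked region from the patch; the interleaving in the comment is the
hypothetical scenario, not an observed failure:)
        if (virtqueue_kick_prepare(vblk->vq)) {
                spin_unlock_irq(vblk->disk->queue->queue_lock);
                virtqueue_notify(vblk->vq);
                /*
                 * Window: blk_done() may run on another CPU here and complete
                 * every outstanding request.  If hot unplug then tears down
                 * the disk and its request queue, the re-lock below would
                 * touch freed memory -- the use-after-free in question.
                 */
                spin_lock_irq(vblk->disk->queue->queue_lock);
        }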
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v3] virtio_blk: unlock vblk->lock during kick
[not found] ` <CAJSP0QWkXeCQKOEMoV6XkNpwbnQczq9Smx85=Tg-73A9fmSyVQ@mail.gmail.com>
@ 2012-06-08 13:51 ` Michael S. Tsirkin
0 siblings, 0 replies; 8+ messages in thread
From: Michael S. Tsirkin @ 2012-06-08 13:51 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: khoa, Stefan Hajnoczi, kvm, virtualization
On Wed, Jun 06, 2012 at 04:25:55PM +0100, Stefan Hajnoczi wrote:
> On Mon, Jun 4, 2012 at 12:11 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Fri, Jun 01, 2012 at 10:13:06AM +0100, Stefan Hajnoczi wrote:
> >> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> >> index 774c31d..d674977 100644
> >> --- a/drivers/block/virtio_blk.c
> >> +++ b/drivers/block/virtio_blk.c
> >> @@ -199,8 +199,14 @@ static void do_virtblk_request(struct request_queue *q)
> >> issued++;
> >> }
> >>
> >> - if (issued)
> >> - virtqueue_kick(vblk->vq);
> >> + if (!issued)
> >> + return;
> >> +
> >> + if (virtqueue_kick_prepare(vblk->vq)) {
> >> + spin_unlock_irq(vblk->disk->queue->queue_lock);
> >> + virtqueue_notify(vblk->vq);
> >
> > If blk_done runs and completes the request at this point,
> > can hot unplug then remove the queue?
> > If yes will we get a use after free?
>
> This is a difficult question; I haven't been able to decide one way or
> another. The use-after-free is the
> spin_lock_irq(vblk->disk->queue->queue_lock).
>
> It still doesn't explain why existing drivers are doing this. In nbd,
> for example, I can't see anything preventing the same situation in
> drivers/block/nbd.c:do_nbd_request() between wake_up() and
> spin_lock_irq(q->queue_lock). If the request completes (like in your
> example scenario) then the module remove code path has no way of
> knowing there is still a thread in do_nbd_request().
>
> Any ideas?
>
> Stefan
Try posting to a storage-related mailing list and/or lkml.
--
MST
^ permalink raw reply [flat|nested] 8+ messages in thread