From: Stefan Hajnoczi <stefanha@gmail.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: borntraeger@de.ibm.com, qemu-devel@nongnu.org, stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH] queue: fix QSLIST_INSERT_HEAD_ATOMIC race
Date: Thu, 12 Mar 2015 13:34:12 +0000 [thread overview]
Message-ID: <20150312133412.GS10493@stefanha-thinkpad.redhat.com> (raw)
In-Reply-To: <1426002357-6889-1-git-send-email-pbonzini@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 3992 bytes --]
On Tue, Mar 10, 2015 at 04:45:57PM +0100, Paolo Bonzini wrote:
> There is a not-so-subtle race in QSLIST_INSERT_HEAD_ATOMIC.
>
> Because atomic_cmpxchg returns the old value instead of a success flag,
> QSLIST_INSERT_HEAD_ATOMIC was checking for success by comparing against
> the second argument to atomic_cmpxchg. Unfortunately, this only works
> if the second argument is a local or thread-local variable.
>
> If it is in memory, it can be subject to common subexpression elimination
> (and then everything's fine) or reloaded after the atomic_cmpxchg,
> depending on the compiler's whims. If the latter happens, the race can
> happen. A thread can sneak in, doing something on elm->field.sle_next
> after the atomic_cmpxchg and before the comparison. This causes a wrong
> failure, and then two threads are using "elm" at the same time. In the
> case discovered by Christian, the sequence was likely something like this:
>
> thread 1 | thread 2
> QSLIST_INSERT_HEAD_ATOMIC |
> atomic_cmpxchg succeeds |
> elm added to list |
> | steal release_pool
> | QSLIST_REMOVE_HEAD
> | elm removed from list
> | ...
> | QSLIST_INSERT_HEAD_ATOMIC
> | (overwrites sle_next)
> spurious failure |
> atomic_cmpxchg succeeds |
> elm added to list again |
> |
> steal release_pool |
> QSLIST_REMOVE_HEAD |
> elm removed again |
>
> The last three steps could be done by a third thread as well.
> A reproducer that failed in a matter of seconds is as follows:
>
> - the guest has 32 VCPUs on a 28 core host (hyperthreading was enabled),
> memory was 16G just to err on the safe side (the host has 64G, but hey
> at least you need no s390)
>
> - the guest has 24 null-aio virtio-blk devices using dataplane
> (-object iothread,id=ioN -drive if=none,id=blkN,driver=null-aio,size=500G
> -device virtio-blk-pci,iothread=ioN,drive=blkN)
>
> - the guest also has a single network interface. It's only doing loopback
> tests so slirp vs. tap and the model doesn't matter.
>
> - the guest is running fio with the following script:
>
> [global]
> rw=randread
> blocksize=16k
> ioengine=libaio
> runtime=10m
> buffered=0
> fallocate=none
> time_based
> iodepth=32
>
> [virtio1a]
> filename=/dev/block/252\:16
>
> [virtio1b]
> filename=/dev/block/252\:16
>
> ...
>
> [virtio24a]
> filename=/dev/block/252\:384
>
> [virtio24b]
> filename=/dev/block/252\:384
>
> [listen1]
> protocol=tcp
> ioengine=net
> port=12345
> listen
> rw=read
> bs=4k
> size=1000g
>
> [connect1]
> protocol=tcp
> hostname=localhost
> ioengine=net
> port=12345
> protocol=tcp
> rw=write
> startdelay=1
> size=1000g
>
> ...
>
> [listen8]
> protocol=tcp
> ioengine=net
> port=12352
> listen
> rw=read
> bs=4k
> size=1000g
>
> [connect8]
> protocol=tcp
> hostname=localhost
> ioengine=net
> port=12352
> rw=write
> startdelay=1
> size=1000g
>
> Moral of the story: I should refrain from writing more clever stuff.
> At least it looks like it is not too clever to be undebuggable.
>
> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Fixes: c740ad92d0d958fa785e5d7aa1b67ecaf30a6a54
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> include/qemu/queue.h | 11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]
next prev parent reply other threads:[~2015-03-12 13:34 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-10 15:45 [Qemu-devel] [PATCH] queue: fix QSLIST_INSERT_HEAD_ATOMIC race Paolo Bonzini
2015-03-10 22:51 ` Christian Borntraeger
2015-03-12 13:34 ` Stefan Hajnoczi [this message]
2015-03-12 13:35 ` Stefan Hajnoczi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150312133412.GS10493@stefanha-thinkpad.redhat.com \
--to=stefanha@gmail.com \
--cc=borntraeger@de.ibm.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).