qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@gmail.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: borntraeger@de.ibm.com, qemu-devel@nongnu.org, stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH] queue: fix QSLIST_INSERT_HEAD_ATOMIC race
Date: Thu, 12 Mar 2015 13:34:12 +0000	[thread overview]
Message-ID: <20150312133412.GS10493@stefanha-thinkpad.redhat.com> (raw)
In-Reply-To: <1426002357-6889-1-git-send-email-pbonzini@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 3992 bytes --]

On Tue, Mar 10, 2015 at 04:45:57PM +0100, Paolo Bonzini wrote:
> There is a not-so-subtle race in QSLIST_INSERT_HEAD_ATOMIC.
> 
> Because atomic_cmpxchg returns the old value instead of a success flag,
> QSLIST_INSERT_HEAD_ATOMIC was checking for success by comparing against
> the second argument to atomic_cmpxchg.  Unfortunately, this only works
> if the second argument is a local or thread-local variable.
> 
> If it is in memory, it can be subject to common subexpression elimination
> (and then everything's fine) or reloaded after the atomic_cmpxchg,
> depending on the compiler's whims.  If the latter happens, the race can
> happen.  A thread can sneak in, doing something on elm->field.sle_next
> after the atomic_cmpxchg and before the comparison.  This causes a wrong
> failure, and then two threads are using "elm" at the same time.  In the
> case discovered by Christian, the sequence was likely something like this:
> 
>     thread 1                   | thread 2
>     QSLIST_INSERT_HEAD_ATOMIC  |
>       atomic_cmpxchg succeeds  |
>       elm added to list        |
>                                | steal release_pool
>                                | QSLIST_REMOVE_HEAD
>                                | elm removed from list
>                                | ...
>                                | QSLIST_INSERT_HEAD_ATOMIC
>                                |   (overwrites sle_next)
>       spurious failure         |
>       atomic_cmpxchg succeeds  |
>       elm added to list again  |
>                                |
>     steal release_pool         |
>     QSLIST_REMOVE_HEAD         |
>     elm removed again          |
> 
> The last three steps could be done by a third thread as well.
> A reproducer that failed in a matter of seconds is as follows:
> 
> - the guest has 32 VCPUs on a 28 core host (hyperthreading was enabled),
>   memory was 16G just to err on the safe side (the host has 64G, but hey
>   at least you need no s390)
> 
> - the guest has 24 null-aio virtio-blk devices using dataplane
>   (-object iothread,id=ioN -drive if=none,id=blkN,driver=null-aio,size=500G
>   -device virtio-blk-pci,iothread=ioN,drive=blkN)
> 
> - the guest also has a single network interface.  It's only doing loopback
>   tests so slirp vs. tap and the model doesn't matter.
> 
> - the guest is running fio with the following script:
> 
>      [global]
>      rw=randread
>      blocksize=16k
>      ioengine=libaio
>      runtime=10m
>      buffered=0
>      fallocate=none
>      time_based
>      iodepth=32
> 
>      [virtio1a]
>      filename=/dev/block/252\:16
> 
>      [virtio1b]
>      filename=/dev/block/252\:16
> 
>      ...
> 
>      [virtio24a]
>      filename=/dev/block/252\:384
> 
>      [virtio24b]
>      filename=/dev/block/252\:384
> 
>      [listen1]
>      protocol=tcp
>      ioengine=net
>      port=12345
>      listen
>      rw=read
>      bs=4k
>      size=1000g
> 
>      [connect1]
>      protocol=tcp
>      hostname=localhost
>      ioengine=net
>      port=12345
>      protocol=tcp
>      rw=write
>      startdelay=1
>      size=1000g
> 
>      ...
> 
>      [listen8]
>      protocol=tcp
>      ioengine=net
>      port=12352
>      listen
>      rw=read
>      bs=4k
>      size=1000g
> 
>      [connect8]
>      protocol=tcp
>      hostname=localhost
>      ioengine=net
>      port=12352
>      rw=write
>      startdelay=1
>      size=1000g
> 
> Moral of the story: I should refrain from writing more clever stuff.
> At least it looks like it is not too clever to be undebuggable.
> 
> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Fixes: c740ad92d0d958fa785e5d7aa1b67ecaf30a6a54
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  include/qemu/queue.h | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

  parent reply	other threads:[~2015-03-12 13:34 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-10 15:45 [Qemu-devel] [PATCH] queue: fix QSLIST_INSERT_HEAD_ATOMIC race Paolo Bonzini
2015-03-10 22:51 ` Christian Borntraeger
2015-03-12 13:34 ` Stefan Hajnoczi [this message]
2015-03-12 13:35 ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150312133412.GS10493@stefanha-thinkpad.redhat.com \
    --to=stefanha@gmail.com \
    --cc=borntraeger@de.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).