* [Qemu-devel] [RFC 00/13] dataplane: use block layer
@ 2013-06-14  9:48 Stefan Hajnoczi
  2013-06-14  9:48 ` [Qemu-devel] [RFC 01/13] block: fix bdrv_flush() ordering in bdrv_close() Stefan Hajnoczi
                   ` (12 more replies)
  0 siblings, 13 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-14  9:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Anthony Liguori, Ping Fan Liu,
	Stefan Hajnoczi

This series adds image format, QMP 'transaction', and QMP 'block_resize' support
to dataplane.  This is done by replacing custom Linux AIO code with QEMU block
layer APIs.

In order for block layer APIs to be safe, bdrv_drain_all() is modified to be
aware of device emulation threads (like dataplane).
BlockDevOps->drain_threads_cb() is introduced as a way to stop device emulation
threads temporarily.  Device emulation threads may resume after this main loop
iteration completes, which is enough to give code running under the QEMU
global mutex a chance to use the block device temporarily.

bdrv_get_aio_context() is dropped in favor of a thread-local AioContext
pointer.  This allows qemu_bh_new(), qemu_aio_wait(), and friends to
automatically use the right AioContext.  When we introduce a BlockDriverState
lock in the future, it will be possible to use block devices from multiple
threads arbitrarily (right now we only allow it if bdrv_drain_all() is used).
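
As a rough sketch of the resulting pattern (names taken from patches 06 and
07; my_ctx, my_cb, and my_opaque are placeholders):

    /* once per thread, before any BH or fd handler API is used */
    *alloc_thread_aio_context() = my_ctx;

    /* later, from the same thread; no explicit AioContext argument */
    QEMUBH *bh = qemu_bh_new(my_cb, my_opaque); /* uses *get_thread_aio_context() */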

Three open issues:

 * Timers only work in the main loop, so I/O throttling is ignored and QED is
   unsafe with x-data-plane=on.

 * Need to test with NBD, Gluster, and other non-file protocols.

 * Need to compare performance against custom Linux AIO code.

Paolo Bonzini (2):
  exec: do not use qemu/tls.h
  qemu-thread: add TLS wrappers

Stefan Hajnoczi (11):
  block: fix bdrv_flush() ordering in bdrv_close()
  dataplane: sync virtio.c and vring.c virtqueue state
  block: add BlockDevOps->drain_threads_cb()
  virtio-blk: implement BlockDevOps->drain_threads_cb()
  block: add thread_aio_context TLS variable
  block: drop bdrv_get_aio_context()
  main-loop: use thread-local AioContext
  block: disable I/O throttling outside main loop
  dataplane: use block layer for I/O
  dataplane: drop ioq Linux AIO request queue
  block: drop raw_get_aio_fd()

 async.c                             |   2 +
 block.c                             |  22 ++-
 block/raw-posix.c                   |  38 +----
 block/raw-win32.c                   |   2 +-
 configure                           |  21 +++
 cpus.c                              |   2 +
 exec.c                              |  10 +-
 hw/block/dataplane/Makefile.objs    |   2 +-
 hw/block/dataplane/ioq.c            | 117 ---------------
 hw/block/dataplane/ioq.h            |  57 -------
 hw/block/dataplane/virtio-blk.c     | 290 ++++++++++++------------------------
 hw/block/virtio-blk.c               |  12 ++
 hw/virtio/dataplane/vring.c         |   8 +-
 include/block/aio.h                 |   4 +
 include/block/block.h               |  17 ++-
 include/block/block_int.h           |   7 -
 include/exec/cpu-all.h              |  14 +-
 include/hw/virtio/dataplane/vring.h |   2 +-
 include/qemu/tls.h                  | 125 +++++++++++++---
 main-loop.c                         |  15 +-
 tests/Makefile                      |   3 +
 tests/test-tls.c                    |  87 +++++++++++
 util/qemu-thread-win32.c            |  17 +++
 23 files changed, 408 insertions(+), 466 deletions(-)
 delete mode 100644 hw/block/dataplane/ioq.c
 delete mode 100644 hw/block/dataplane/ioq.h
 create mode 100644 tests/test-tls.c

-- 
1.8.1.4

* [Qemu-devel] [RFC 01/13] block: fix bdrv_flush() ordering in bdrv_close()
  2013-06-14  9:48 [Qemu-devel] [RFC 00/13] dataplane: use block layer Stefan Hajnoczi
@ 2013-06-14  9:48 ` Stefan Hajnoczi
  2013-06-14  9:48 ` [Qemu-devel] [RFC 02/13] dataplane: sync virtio.c and vring.c virtqueue state Stefan Hajnoczi
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-14  9:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Anthony Liguori, Ping Fan Liu,
	Stefan Hajnoczi

Since 80ccf93b we flush the block device during close.  The
bdrv_drain_all() call should come before bdrv_flush() to ensure guest
write requests have completed.  Otherwise we may miss pending writes
when flushing.

However, there is still a slight chance that cancelling a blockjob or
doing bdrv_flush() might leave an I/O request running, so call
bdrv_drain_all() again for safety as the final step.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 79ad33d..1c9df27 100644
--- a/block.c
+++ b/block.c
@@ -1357,11 +1357,12 @@ void bdrv_reopen_abort(BDRVReopenState *reopen_state)
 
 void bdrv_close(BlockDriverState *bs)
 {
-    bdrv_flush(bs);
+    bdrv_drain_all(); /* complete guest I/O */
     if (bs->job) {
         block_job_cancel_sync(bs->job);
     }
-    bdrv_drain_all();
+    bdrv_flush(bs);
+    bdrv_drain_all(); /* in case blockjob or flush started I/O */
     notifier_list_notify(&bs->close_notifiers, bs);
 
     if (bs->drv) {
-- 
1.8.1.4

* [Qemu-devel] [RFC 02/13] dataplane: sync virtio.c and vring.c virtqueue state
  2013-06-14  9:48 [Qemu-devel] [RFC 00/13] dataplane: use block layer Stefan Hajnoczi
  2013-06-14  9:48 ` [Qemu-devel] [RFC 01/13] block: fix bdrv_flush() ordering in bdrv_close() Stefan Hajnoczi
@ 2013-06-14  9:48 ` Stefan Hajnoczi
  2013-06-14  9:48 ` [Qemu-devel] [RFC 03/13] block: add BlockDevOps->drain_threads_cb() Stefan Hajnoczi
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-14  9:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Anthony Liguori, Ping Fan Liu,
	Stefan Hajnoczi

Load the virtio.c state into vring.c when we start dataplane mode and
vice versa when stopping dataplane mode.  This patch makes it possible
to start and stop dataplane any time while the guest is running.

This is very useful since it will allow us to go back to the QEMU main
loop for bdrv_drain_all() and live migration.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/block/dataplane/virtio-blk.c     | 2 +-
 hw/virtio/dataplane/vring.c         | 8 +++++---
 include/hw/virtio/dataplane/vring.h | 2 +-
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 0356665..2faed43 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -537,7 +537,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
     /* Clean up guest notifier (irq) */
     k->set_guest_notifiers(qbus->parent, 1, false);
 
-    vring_teardown(&s->vring);
+    vring_teardown(&s->vring, s->vdev, 0);
     s->started = false;
     s->stopping = false;
 }
diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index e0d6e83..82cc151 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -39,8 +39,8 @@ bool vring_setup(Vring *vring, VirtIODevice *vdev, int n)
 
     vring_init(&vring->vr, virtio_queue_get_num(vdev, n), vring_ptr, 4096);
 
-    vring->last_avail_idx = 0;
-    vring->last_used_idx = 0;
+    vring->last_avail_idx = virtio_queue_get_last_avail_idx(vdev, n);
+    vring->last_used_idx = vring->vr.used->idx;
     vring->signalled_used = 0;
     vring->signalled_used_valid = false;
 
@@ -49,8 +49,10 @@ bool vring_setup(Vring *vring, VirtIODevice *vdev, int n)
     return true;
 }
 
-void vring_teardown(Vring *vring)
+void vring_teardown(Vring *vring, VirtIODevice *vdev, int n)
 {
+    virtio_queue_set_last_avail_idx(vdev, n, vring->last_avail_idx);
+
     hostmem_finalize(&vring->hostmem);
 }
 
diff --git a/include/hw/virtio/dataplane/vring.h b/include/hw/virtio/dataplane/vring.h
index 9380cb5..c0b69ff 100644
--- a/include/hw/virtio/dataplane/vring.h
+++ b/include/hw/virtio/dataplane/vring.h
@@ -50,7 +50,7 @@ static inline void vring_set_broken(Vring *vring)
 }
 
 bool vring_setup(Vring *vring, VirtIODevice *vdev, int n);
-void vring_teardown(Vring *vring);
+void vring_teardown(Vring *vring, VirtIODevice *vdev, int n);
 void vring_disable_notification(VirtIODevice *vdev, Vring *vring);
 bool vring_enable_notification(VirtIODevice *vdev, Vring *vring);
 bool vring_should_notify(VirtIODevice *vdev, Vring *vring);
-- 
1.8.1.4

* [Qemu-devel] [RFC 03/13] block: add BlockDevOps->drain_threads_cb()
  2013-06-14  9:48 [Qemu-devel] [RFC 00/13] dataplane: use block layer Stefan Hajnoczi
  2013-06-14  9:48 ` [Qemu-devel] [RFC 01/13] block: fix bdrv_flush() ordering in bdrv_close() Stefan Hajnoczi
  2013-06-14  9:48 ` [Qemu-devel] [RFC 02/13] dataplane: sync virtio.c and vring.c virtqueue state Stefan Hajnoczi
@ 2013-06-14  9:48 ` Stefan Hajnoczi
  2013-06-14  9:48 ` [Qemu-devel] [RFC 04/13] virtio-blk: implement BlockDevOps->drain_threads_cb() Stefan Hajnoczi
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-14  9:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Anthony Liguori, Ping Fan Liu,
	Stefan Hajnoczi

There are times when the QEMU main loop wishes to drain I/O requests.
Up until now bdrv_drain_all() meant that no new guest I/O would be
processed until the next event loop iteration.

This is no longer true with dataplane since it runs outside the QEMU
global mutex.  The BlockDevOps->drain_threads_cb() interface allows the
device model to drain and stop threads.  Once draining completes, the
QEMU main loop can be sure that no further I/O will take place until the
next main loop iteration.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block.c               | 6 ++++++
 include/block/block.h | 9 +++++++++
 2 files changed, 15 insertions(+)

diff --git a/block.c b/block.c
index 1c9df27..a3323b2 100644
--- a/block.c
+++ b/block.c
@@ -1430,6 +1430,12 @@ void bdrv_drain_all(void)
     BlockDriverState *bs;
     bool busy;
 
+    QTAILQ_FOREACH(bs, &bdrv_states, list) {
+        if (bs->dev_ops && bs->dev_ops->drain_threads_cb) {
+            bs->dev_ops->drain_threads_cb(bs->dev_opaque);
+        }
+    }
+
     do {
         busy = qemu_aio_wait();
 
diff --git a/include/block/block.h b/include/block/block.h
index 2307f67..85147bb 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -60,6 +60,15 @@ typedef struct BlockDevOps {
      * Runs when the size changed (e.g. monitor command block_resize)
      */
     void (*resize_cb)(void *opaque);
+    /*
+     * Notifies the device model to drain any emulation threads
+     *
+     * Upon return, there must be no new or pending requests outside the QEMU
+     * main loop.  The device model may restart emulation threads after this
+     * main loop iteration, for example in a hardware register handler
+     * function.
+     */
+    void (*drain_threads_cb)(void *opaque);
 } BlockDevOps;
 
 #define BDRV_O_RDWR        0x0002
-- 
1.8.1.4

* [Qemu-devel] [RFC 04/13] virtio-blk: implement BlockDevOps->drain_threads_cb()
  2013-06-14  9:48 [Qemu-devel] [RFC 00/13] dataplane: use block layer Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2013-06-14  9:48 ` [Qemu-devel] [RFC 03/13] block: add BlockDevOps->drain_threads_cb() Stefan Hajnoczi
@ 2013-06-14  9:48 ` Stefan Hajnoczi
  2013-06-14 14:23   ` Paolo Bonzini
  2013-06-14  9:48 ` [Qemu-devel] [RFC 05/13] exec: do not use qemu/tls.h Stefan Hajnoczi
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-14  9:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Anthony Liguori, Ping Fan Liu,
	Stefan Hajnoczi

Drain and stop the dataplane thread when bdrv_drain_all() is called.
Note that the thread will be restarted in virtio_blk_handle_output() the
next time the guest kicks the virtqueue.
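
For context, the restart path looks roughly like this (a sketch, not part
of this patch; virtio_blk_data_plane_start() is the existing dataplane
entry point):

    static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
    {
        VirtIOBlock *s = VIRTIO_BLK(vdev);

    #ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
        if (s->dataplane) {
            /* (re)starts the thread stopped by drain_threads_cb() */
            virtio_blk_data_plane_start(s->dataplane);
            return;
        }
    #endif
        /* ... non-dataplane request handling ... */
    }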

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/block/virtio-blk.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index cf12469..f9c2b79 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -618,8 +618,20 @@ static void virtio_blk_resize(void *opaque)
     virtio_notify_config(vdev);
 }
 
+static void virtio_blk_drain_threads(void *opaque)
+{
+#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
+    VirtIOBlock *s = VIRTIO_BLK(opaque);
+
+    if (s->dataplane) {
+        virtio_blk_data_plane_stop(s->dataplane);
+    }
+#endif
+}
+
 static const BlockDevOps virtio_block_ops = {
     .resize_cb = virtio_blk_resize,
+    .drain_threads_cb = virtio_blk_drain_threads,
 };
 
 void virtio_blk_set_conf(DeviceState *dev, VirtIOBlkConf *blk)
-- 
1.8.1.4

* [Qemu-devel] [RFC 05/13] exec: do not use qemu/tls.h
  2013-06-14  9:48 [Qemu-devel] [RFC 00/13] dataplane: use block layer Stefan Hajnoczi
                   ` (3 preceding siblings ...)
  2013-06-14  9:48 ` [Qemu-devel] [RFC 04/13] virtio-blk: implement BlockDevOps->drain_threads_cb() Stefan Hajnoczi
@ 2013-06-14  9:48 ` Stefan Hajnoczi
  2013-06-14  9:48 ` [Qemu-devel] [RFC 06/13] qemu-thread: add TLS wrappers Stefan Hajnoczi
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-14  9:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Anthony Liguori, Ping Fan Liu

From: Paolo Bonzini <pbonzini@redhat.com>

The next patch will change qemu/tls.h to support more platforms, but at
some performance cost.  Declare cpu_single_env directly instead of using
the tls.h abstractions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c                 | 10 ++++++++--
 include/exec/cpu-all.h | 14 +++++++++++---
 include/qemu/tls.h     | 52 --------------------------------------------------
 3 files changed, 19 insertions(+), 57 deletions(-)
 delete mode 100644 include/qemu/tls.h

diff --git a/exec.c b/exec.c
index 5b8b40d..10d97e0 100644
--- a/exec.c
+++ b/exec.c
@@ -71,9 +71,15 @@ static MemoryRegion io_mem_unassigned, io_mem_subpage_ram;
 #endif
 
 CPUArchState *first_cpu;
+
 /* current CPU in the current thread. It is only valid inside
-   cpu_exec() */
-DEFINE_TLS(CPUArchState *,cpu_single_env);
+ * cpu_exec().  See comment in include/exec/cpu-all.h.  */
+#if defined CONFIG_KVM || (defined CONFIG_USER_ONLY && defined CONFIG_USE_NPTL)
+__thread CPUArchState *cpu_single_env;
+#else
+CPUArchState *cpu_single_env;
+#endif
+
 /* 0 = Do not count executed instructions.
    1 = Precise instruction counting.
    2 = Adaptive rate instruction counting.  */
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index e9c3717..2202ba3 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -20,7 +20,6 @@
 #define CPU_ALL_H
 
 #include "qemu-common.h"
-#include "qemu/tls.h"
 #include "exec/cpu-common.h"
 #include "qemu/thread.h"
 
@@ -368,8 +367,17 @@ void cpu_dump_statistics(CPUArchState *env, FILE *f, fprintf_function cpu_fprint
 void QEMU_NORETURN cpu_abort(CPUArchState *env, const char *fmt, ...)
     GCC_FMT_ATTR(2, 3);
 extern CPUArchState *first_cpu;
-DECLARE_TLS(CPUArchState *,cpu_single_env);
-#define cpu_single_env tls_var(cpu_single_env)
+
+/* This is thread-local depending on __linux__ because:
+ *  - the only -user mode supporting multiple VCPU threads is linux-user
+ *  - TCG system mode is single-threaded regarding VCPUs
+ *  - KVM system mode is multi-threaded but limited to Linux
+ */
+#if defined CONFIG_KVM || (defined CONFIG_USER_ONLY && defined CONFIG_USE_NPTL)
+extern __thread CPUArchState *cpu_single_env;
+#else
+extern CPUArchState *cpu_single_env;
+#endif
 
 /* Flags for use in ENV->INTERRUPT_PENDING.
 
diff --git a/include/qemu/tls.h b/include/qemu/tls.h
deleted file mode 100644
index b92ea9d..0000000
--- a/include/qemu/tls.h
+++ /dev/null
@@ -1,52 +0,0 @@
-/*
- * Abstraction layer for defining and using TLS variables
- *
- * Copyright (c) 2011 Red Hat, Inc
- * Copyright (c) 2011 Linaro Limited
- *
- * Authors:
- *  Paolo Bonzini <pbonzini@redhat.com>
- *  Peter Maydell <peter.maydell@linaro.org>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License as
- * published by the Free Software Foundation; either version 2 of
- * the License, or (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License along
- * with this program; if not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef QEMU_TLS_H
-#define QEMU_TLS_H
-
-/* Per-thread variables. Note that we only have implementations
- * which are really thread-local on Linux; the dummy implementations
- * define plain global variables.
- *
- * This means that for the moment use should be restricted to
- * per-VCPU variables, which are OK because:
- *  - the only -user mode supporting multiple VCPU threads is linux-user
- *  - TCG system mode is single-threaded regarding VCPUs
- *  - KVM system mode is multi-threaded but limited to Linux
- *
- * TODO: proper implementations via Win32 .tls sections and
- * POSIX pthread_getspecific.
- */
-#ifdef __linux__
-#define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
-#define DEFINE_TLS(type, x)  __thread __typeof__(type) tls__##x
-#define tls_var(x)           tls__##x
-#else
-/* Dummy implementations which define plain global variables */
-#define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
-#define DEFINE_TLS(type, x)  __typeof__(type) tls__##x
-#define tls_var(x)           tls__##x
-#endif
-
-#endif
-- 
1.8.1.4

* [Qemu-devel] [RFC 06/13] qemu-thread: add TLS wrappers
  2013-06-14  9:48 [Qemu-devel] [RFC 00/13] dataplane: use block layer Stefan Hajnoczi
                   ` (4 preceding siblings ...)
  2013-06-14  9:48 ` [Qemu-devel] [RFC 05/13] exec: do not use qemu/tls.h Stefan Hajnoczi
@ 2013-06-14  9:48 ` Stefan Hajnoczi
  2013-06-20  7:26   ` Fam Zheng
  2013-06-14  9:48 ` [Qemu-devel] [RFC 07/13] block: add thread_aio_context TLS variable Stefan Hajnoczi
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-14  9:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Paolo Bonzini, Anthony Liguori, Ping Fan Liu

From: Paolo Bonzini <pbonzini@redhat.com>

Fast TLS is not available on all platforms, but we want to use it where
possible.  This wrapper implementation falls back to pthread_get/setspecific
on POSIX systems that lack __thread, but uses the dynamic linker's TLS
support on Linux and Windows.

The user shall call alloc_foo() in every thread that needs to access the
variable---exactly once and before any access.  foo is the name of the
variable as passed to DECLARE_TLS and DEFINE_TLS.  Then, get_foo() will
return the address of the variable.  It is guaranteed to remain the same
across the lifetime of a thread, so you can cache it.
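
A minimal usage sketch (patch 07 declares thread_aio_context this way, and
tests/test-tls.c below exercises the same pattern):

    /* in a header: declare the variable and its accessors */
    DECLARE_TLS(AioContext *, thread_aio_context);

    /* in exactly one .c file: define it */
    DEFINE_TLS(AioContext *, thread_aio_context);

    /* in each thread, once, before any access */
    *alloc_thread_aio_context() = ctx;

    /* afterwards, from the same thread */
    AioContext *my_ctx = *get_thread_aio_context();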

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 configure                |  21 +++++++
 include/qemu/tls.h       | 139 +++++++++++++++++++++++++++++++++++++++++++++++
 tests/Makefile           |   3 +
 tests/test-tls.c         |  87 +++++++++++++++++++++++++++++
 util/qemu-thread-win32.c |  17 ++++++
 5 files changed, 267 insertions(+)
 create mode 100644 include/qemu/tls.h
 create mode 100644 tests/test-tls.c

diff --git a/configure b/configure
index 1654413..c679386 100755
--- a/configure
+++ b/configure
@@ -284,6 +284,7 @@ fi
 ar="${AR-${cross_prefix}ar}"
 as="${AS-${cross_prefix}as}"
 cpp="${CPP-$cc -E}"
+nm="${NM-${cross_prefix}nm}"
 objcopy="${OBJCOPY-${cross_prefix}objcopy}"
 ld="${LD-${cross_prefix}ld}"
 libtool="${LIBTOOL-${cross_prefix}libtool}"
@@ -3157,6 +3158,22 @@ if test "$trace_backend" = "dtrace"; then
 fi
 
 ##########################################
+# check for TLS runtime
+
+# Some versions of mingw include the "magic" definitions that make
+# TLS work, some don't.  Check for it.
+
+if test "$mingw32" = yes; then
+  cat > $TMPC << EOF
+int main(void) {}
+EOF
+  compile_prog "" ""
+  if $nm $TMPE | grep _tls_used > /dev/null 2>&1; then
+    mingw32_tls_runtime=yes
+  fi
+fi
+
+##########################################
 # check and set a backend for coroutine
 
 # We prefer ucontext, but it's not always possible. The fallback
@@ -3615,6 +3632,9 @@ if test "$mingw32" = "yes" ; then
   version_micro=0
   echo "CONFIG_FILEVERSION=$version_major,$version_minor,$version_subminor,$version_micro" >> $config_host_mak
   echo "CONFIG_PRODUCTVERSION=$version_major,$version_minor,$version_subminor,$version_micro" >> $config_host_mak
+  if test "$mingw32_tls_runtime" = yes; then
+    echo "CONFIG_MINGW32_TLS_RUNTIME=y" >> $config_host_mak
+  fi
 else
   echo "CONFIG_POSIX=y" >> $config_host_mak
 fi
@@ -4040,6 +4060,7 @@ echo "OBJCC=$objcc" >> $config_host_mak
 echo "AR=$ar" >> $config_host_mak
 echo "AS=$as" >> $config_host_mak
 echo "CPP=$cpp" >> $config_host_mak
+echo "NM=$nm" >> $config_host_mak
 echo "OBJCOPY=$objcopy" >> $config_host_mak
 echo "LD=$ld" >> $config_host_mak
 echo "WINDRES=$windres" >> $config_host_mak
diff --git a/include/qemu/tls.h b/include/qemu/tls.h
new file mode 100644
index 0000000..7fd0632
--- /dev/null
+++ b/include/qemu/tls.h
@@ -0,0 +1,139 @@
+/*
+ * Abstraction layer for defining and using TLS variables
+ *
+ * Copyright (c) 2011, 2013 Red Hat, Inc
+ * Copyright (c) 2011 Linaro Limited
+ *
+ * Authors:
+ *  Paolo Bonzini <pbonzini@redhat.com>
+ *  Peter Maydell <peter.maydell@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef QEMU_TLS_H
+#define QEMU_TLS_H
+
+#if defined __linux__
+#define DECLARE_TLS(type, x)                 \
+extern __thread typeof(type) x;              \
+                                             \
+static inline typeof(type) *get_##x(void)    \
+{                                            \
+    return &x;                               \
+}                                            \
+                                             \
+static inline typeof(type) *alloc_##x(void)  \
+{                                            \
+    return &x;                               \
+}                                            \
+                                             \
+extern int dummy_##__LINE__
+
+#define DEFINE_TLS(type, x)                  \
+__thread typeof(type) x
+
+#elif defined CONFIG_POSIX
+typedef struct QEMUTLSValue {
+    pthread_key_t k;
+    pthread_once_t o;
+} QEMUTLSValue;
+
+#define DECLARE_TLS(type, x)                 \
+extern QEMUTLSValue x;                       \
+extern void init_##x(void);                  \
+                                             \
+static inline typeof(type) *get_##x(void)    \
+{                                            \
+    return pthread_getspecific(x.k);         \
+}                                            \
+                                             \
+static inline typeof(type) *alloc_##x(void)  \
+{                                            \
+    void *datum = g_malloc0(sizeof(type));   \
+    pthread_once(&x.o, init_##x);            \
+    pthread_setspecific(x.k, datum);         \
+    return datum;                            \
+}                                            \
+                                             \
+extern int dummy_##__LINE__
+
+#define DEFINE_TLS(type, x)                  \
+void init_##x(void) {                        \
+    pthread_key_create(&x.k, g_free);        \
+}                                            \
+                                             \
+QEMUTLSValue x = { .o = PTHREAD_ONCE_INIT }
+
+#elif defined CONFIG_WIN32
+
+/* The initial contents of TLS variables are placed in the .tls section.
+ * The linker takes all section starting with ".tls$", sorts them and puts
+ * the contents in a single ".tls" section.  qemu-thread-win32.c defines
+ * special symbols in .tls$000 and .tls$ZZZ that represent the beginning
+ * and end of TLS memory.  The linker and run-time library then cooperate
+ * to copy memory between those symbols in the TLS area of new threads.
+ *
+ * _tls_index holds the number of our module.  The executable should be
+ * zero, DLLs are numbered 1 and up.  The loader fills it in for us.
+ *
+ * Thus, Teb->ThreadLocalStoragePointer[_tls_index] is the base of
+ * the TLS segment for this (thread, module) pair.  Each segment has
+ * the same layout as this module's .tls segment and is initialized
+ * with the content of the .tls segment; 0 is the _tls_start variable.
+ * So, get_##x passes us the offset of the passed variable relative to
+ * _tls_start, and we return that same offset plus the base of segment.
+ */
+
+typedef struct _TEB {
+    NT_TIB NtTib;
+    void *EnvironmentPointer;
+    void *x[3];
+    char **ThreadLocalStoragePointer;
+} TEB, *PTEB;
+
+extern int _tls_index;
+extern int _tls_start;
+
+static inline void *tls_var(size_t offset)
+{
+    PTEB Teb = NtCurrentTeb();
+    return (char *)(Teb->ThreadLocalStoragePointer[_tls_index]) + offset;
+}
+
+#define DECLARE_TLS(type, x)                                         \
+extern typeof(type) tls_##x __attribute__((section(".tls$QEMU")));   \
+                                                                     \
+static inline typeof(type) *get_##x(void)                            \
+{                                                                    \
+    return tls_var((ULONG_PTR)&(tls_##x) - (ULONG_PTR)&_tls_start);  \
+}                                                                    \
+                                                                     \
+static inline typeof(type) *alloc_##x(void)                          \
+{                                                                    \
+    typeof(type) *addr = get_##x();                                  \
+    memset((void *)addr, 0, sizeof(type));                           \
+    return addr;                                                     \
+}                                                                    \
+                                                                     \
+extern int dummy_##__LINE__
+
+#define DEFINE_TLS(type, x)                                          \
+typeof(type) tls_##x __attribute__((section(".tls$QEMU")))
+
+#else
+#error No TLS abstraction available on this platform
+#endif
+
+#endif
diff --git a/tests/Makefile b/tests/Makefile
index c107489..1f5156a 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -42,6 +42,8 @@ check-unit-y += tests/test-xbzrle$(EXESUF)
 gcov-files-test-xbzrle-y = xbzrle.c
 check-unit-y += tests/test-cutils$(EXESUF)
 gcov-files-test-cutils-y += util/cutils.c
+check-unit-y += tests/test-tls$(EXESUF)
+gcov-files-test-tls-y +=
 check-unit-y += tests/test-mul64$(EXESUF)
 gcov-files-test-mul64-y = util/host-utils.c
 
@@ -98,6 +100,7 @@ tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o libqemuutil.a libqemustub.a
 tests/test-x86-cpuid$(EXESUF): tests/test-x86-cpuid.o
 tests/test-xbzrle$(EXESUF): tests/test-xbzrle.o xbzrle.o page_cache.o libqemuutil.a
 tests/test-cutils$(EXESUF): tests/test-cutils.o util/cutils.o
+tests/test-tls$(EXESUF): tests/test-tls.o libqemuutil.a
 
 tests/test-qapi-types.c tests/test-qapi-types.h :\
 $(SRC_PATH)/qapi-schema-test.json $(SRC_PATH)/scripts/qapi-types.py
diff --git a/tests/test-tls.c b/tests/test-tls.c
new file mode 100644
index 0000000..54e981d
--- /dev/null
+++ b/tests/test-tls.c
@@ -0,0 +1,87 @@
+/*
+ * Unit-tests for TLS wrappers
+ *
+ * Copyright (C) 2013 Red Hat Inc.
+ *
+ * Authors:
+ *  Paolo Bonzini <pbonzini@redhat.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include <glib.h>
+#include <errno.h>
+#include <string.h>
+
+#include "qemu-common.h"
+#include "qemu/atomic.h"
+#include "qemu/thread.h"
+#include "qemu/tls.h"
+
+DECLARE_TLS(volatile long long, cnt);
+DEFINE_TLS(volatile long long, cnt);
+
+#define NUM_THREADS 10
+
+int stop;
+
+static void *test_thread(void *arg)
+{
+    volatile long long *p_cnt = alloc_cnt();
+    volatile long long **p_ret = arg;
+    long long exp = 0;
+
+    g_assert(get_cnt() == p_cnt);
+    *p_ret = p_cnt;
+    g_assert(*p_cnt == 0);
+    while (atomic_mb_read(&stop) == 0) {
+        exp++;
+        (*p_cnt)++;
+        g_assert(*get_cnt() == exp);
+    }
+
+    return NULL;
+}
+
+static void test_tls(void)
+{
+    volatile long long *addr[NUM_THREADS];
+    QemuThread t[NUM_THREADS];
+    int i;
+
+    for (i = 0; i < NUM_THREADS; i++) {
+        qemu_thread_create(&t[i], test_thread, &addr[i], QEMU_THREAD_JOINABLE);
+    }
+    g_usleep(1000000);
+    atomic_mb_set(&stop, 1);
+    for (i = 0; i < NUM_THREADS; i++) {
+        qemu_thread_join(&t[i]);
+    }
+    for (i = 1; i < NUM_THREADS; i++) {
+        g_assert(addr[i] != addr[i - 1]);
+    }
+}
+
+int main(int argc, char **argv)
+{
+    g_test_init(&argc, &argv, NULL);
+
+    g_test_add_func("/tls", test_tls);
+    return g_test_run();
+}
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index 517878d..f75e404 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -16,6 +16,23 @@
 #include <assert.h>
 #include <limits.h>
 
+/* TLS support.  Some versions of mingw32 provide it, others do not.  */
+
+#ifndef CONFIG_MINGW32_TLS_RUNTIME
+int __attribute__((section(".tls$AAA"))) _tls_start = 0;
+int __attribute__((section(".tls$ZZZ"))) _tls_end = 0;
+int _tls_index = 0;
+
+const IMAGE_TLS_DIRECTORY _tls_used __attribute__((used, section(".rdata$T"))) = {
+ (ULONG)(ULONG_PTR) &_tls_start, /* start of tls data */
+ (ULONG)(ULONG_PTR) &_tls_end,   /* end of tls data */
+ (ULONG)(ULONG_PTR) &_tls_index, /* address of tls_index */
+ (ULONG) 0,                      /* pointer to callbacks */
+ (ULONG) 0,                      /* size of tls zero fill */
+ (ULONG) 0                       /* characteristics */
+};
+#endif
+
 static void error_exit(int err, const char *msg)
 {
     char *pstr;
-- 
1.8.1.4

* [Qemu-devel] [RFC 07/13] block: add thread_aio_context TLS variable
  2013-06-14  9:48 [Qemu-devel] [RFC 00/13] dataplane: use block layer Stefan Hajnoczi
                   ` (5 preceding siblings ...)
  2013-06-14  9:48 ` [Qemu-devel] [RFC 06/13] qemu-thread: add TLS wrappers Stefan Hajnoczi
@ 2013-06-14  9:48 ` Stefan Hajnoczi
  2013-06-14  9:48 ` [Qemu-devel] [RFC 08/13] block: drop bdrv_get_aio_context() Stefan Hajnoczi
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-14  9:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Anthony Liguori, Ping Fan Liu,
	Stefan Hajnoczi

BH and fd handler APIs need to know which AioContext (event loop) to run
inside.  Passing it around everywhere is not feasible since it would
require adding arguments to every call chain that invokes BH and fd
handler APIs (hint: there are many and they change).

Instead make the AioContext pointer thread-local.  This way, any
function that needs to use an AioContext can find it.  In particular:
the iothread and vcpu threads share the main AioContext since they hold
the global mutex while running.  Dataplane threads will use their own
AioContexts.
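
For illustration, a dataplane thread would publish its own AioContext
along these lines (a sketch; the actual wiring lands when dataplane
switches to the block layer later in this series):

    static void *data_plane_thread(void *opaque)
    {
        VirtIOBlockDataPlane *s = opaque;

        /* publish this thread's event loop so BH/fd handler APIs find it */
        *alloc_thread_aio_context() = s->ctx;

        do {
            aio_poll(s->ctx, true);
        } while (!s->stopping || s->num_reqs > 0);

        return NULL;
    }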

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 async.c             | 2 ++
 cpus.c              | 2 ++
 include/block/aio.h | 4 ++++
 main-loop.c         | 1 +
 4 files changed, 9 insertions(+)

diff --git a/async.c b/async.c
index 90fe906..765cbc0 100644
--- a/async.c
+++ b/async.c
@@ -27,6 +27,8 @@
 #include "block/thread-pool.h"
 #include "qemu/main-loop.h"
 
+DEFINE_TLS(AioContext*, thread_aio_context);
+
 /***********************************************************/
 /* bottom halves (can be seen as timers which expire ASAP) */
 
diff --git a/cpus.c b/cpus.c
index c232265..529d612 100644
--- a/cpus.c
+++ b/cpus.c
@@ -741,6 +741,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
     qemu_thread_get_self(cpu->thread);
     cpu->thread_id = qemu_get_thread_id();
     cpu_single_env = env;
+    *alloc_thread_aio_context() = qemu_get_aio_context();
 
     r = kvm_init_vcpu(cpu);
     if (r < 0) {
@@ -815,6 +816,7 @@ static void tcg_exec_all(void);
 static void tcg_signal_cpu_creation(CPUState *cpu, void *data)
 {
     cpu->thread_id = qemu_get_thread_id();
+    *alloc_thread_aio_context() = qemu_get_aio_context();
     cpu->created = true;
 }
 
diff --git a/include/block/aio.h b/include/block/aio.h
index 1836793..6ee9369 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -17,6 +17,7 @@
 #include "qemu-common.h"
 #include "qemu/queue.h"
 #include "qemu/event_notifier.h"
+#include "qemu/tls.h"
 
 typedef struct BlockDriverAIOCB BlockDriverAIOCB;
 typedef void BlockDriverCompletionFunc(void *opaque, int ret);
@@ -74,6 +75,9 @@ typedef struct AioContext {
 /* Returns 1 if there are still outstanding AIO requests; 0 otherwise */
 typedef int (AioFlushEventNotifierHandler)(EventNotifier *e);
 
+/* Each thread can have a default AioContext */
+DECLARE_TLS(AioContext*, thread_aio_context);
+
 /**
  * aio_context_new: Allocate a new AioContext.
  *
diff --git a/main-loop.c b/main-loop.c
index cf36645..51eeb0f 100644
--- a/main-loop.c
+++ b/main-loop.c
@@ -142,6 +142,7 @@ int qemu_init_main_loop(void)
 
     gpollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
     qemu_aio_context = aio_context_new();
+    *alloc_thread_aio_context() = qemu_aio_context;
     src = aio_get_g_source(qemu_aio_context);
     g_source_attach(src, NULL);
     g_source_unref(src);
-- 
1.8.1.4

* [Qemu-devel] [RFC 08/13] block: drop bdrv_get_aio_context()
  2013-06-14  9:48 [Qemu-devel] [RFC 00/13] dataplane: use block layer Stefan Hajnoczi
                   ` (6 preceding siblings ...)
  2013-06-14  9:48 ` [Qemu-devel] [RFC 07/13] block: add thread_aio_context TLS variable Stefan Hajnoczi
@ 2013-06-14  9:48 ` Stefan Hajnoczi
  2013-06-14 14:12   ` Paolo Bonzini
  2013-06-14  9:48 ` [Qemu-devel] [RFC 09/13] main-loop: use thread-local AioContext Stefan Hajnoczi
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-14  9:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Anthony Liguori, Ping Fan Liu,
	Stefan Hajnoczi

Associating a BlockDriverState with a single AioContext is not flexible
enough.  Once we make BlockDriverState thread-safe, it will be possible
to call bdrv_*() functions from multiple event loops.

Use the thread-local AioContext pointer instead of
bdrv_get_aio_context().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block.c                   | 6 ------
 block/raw-posix.c         | 4 ++--
 block/raw-win32.c         | 2 +-
 include/block/block_int.h | 7 -------
 4 files changed, 3 insertions(+), 16 deletions(-)

diff --git a/block.c b/block.c
index a3323b2..d986fc8 100644
--- a/block.c
+++ b/block.c
@@ -4582,9 +4582,3 @@ out:
         bdrv_delete(bs);
     }
 }
-
-AioContext *bdrv_get_aio_context(BlockDriverState *bs)
-{
-    /* Currently BlockDriverState always uses the main loop AioContext */
-    return qemu_get_aio_context();
-}
diff --git a/block/raw-posix.c b/block/raw-posix.c
index c0ccf27..821c19f 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -796,7 +796,7 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
     acb->aio_offset = sector_num * 512;
 
     trace_paio_submit(acb, opaque, sector_num, nb_sectors, type);
-    pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    pool = aio_get_thread_pool(*get_thread_aio_context());
     return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
 }
 
@@ -1460,7 +1460,7 @@ static BlockDriverAIOCB *hdev_aio_ioctl(BlockDriverState *bs,
     acb->aio_offset = 0;
     acb->aio_ioctl_buf = buf;
     acb->aio_ioctl_cmd = req;
-    pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    pool = aio_get_thread_pool(*get_thread_aio_context());
     return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
 }
 
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 7c03b6d..b5e59a8 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -158,7 +158,7 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, HANDLE hfile,
     acb->aio_offset = sector_num * 512;
 
     trace_paio_submit(acb, opaque, sector_num, nb_sectors, type);
-    pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    pool = aio_get_thread_pool(*get_thread_aio_context());
     return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
 }
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index ba52247..88c8b52 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -298,13 +298,6 @@ int get_tmp_filename(char *filename, int size);
 void bdrv_set_io_limits(BlockDriverState *bs,
                         BlockIOLimit *io_limits);
 
-/**
- * bdrv_get_aio_context:
- *
- * Returns: the currently bound #AioContext
- */
-AioContext *bdrv_get_aio_context(BlockDriverState *bs);
-
 #ifdef _WIN32
 int is_windows_drive(const char *filename);
 #endif
-- 
1.8.1.4

* [Qemu-devel] [RFC 09/13] main-loop: use thread-local AioContext
  2013-06-14  9:48 [Qemu-devel] [RFC 00/13] dataplane: use block layer Stefan Hajnoczi
                   ` (7 preceding siblings ...)
  2013-06-14  9:48 ` [Qemu-devel] [RFC 08/13] block: drop bdrv_get_aio_context() Stefan Hajnoczi
@ 2013-06-14  9:48 ` Stefan Hajnoczi
  2013-06-14  9:48 ` [Qemu-devel] [RFC 10/13] block: disable I/O throttling outside main loop Stefan Hajnoczi
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-14  9:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Anthony Liguori, Ping Fan Liu,
	Stefan Hajnoczi

qemu_bh_new() and several other functions use qemu_aio_context, the
global QEMU main loop AioContext.  Now that we have introduced threads
with their own AioContext and a thread-local AioContext pointer, convert
qemu_bh_new() and friends to use the thread-local AioContext pointer.

qemu_bh_new() now creates a QEMUBH for the current thread's AioContext.
This is the right thing to do - it allows existing code to work outside
the main loop thread.

One exception: qemu_notify_event() still uses the global
qemu_aio_context because all callers expect to kick the main loop, not
some other event loop.
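
For reference, qemu_notify_event() stays pinned to the main loop, roughly:

    void qemu_notify_event(void)
    {
        if (!qemu_aio_context) {
            return;
        }
        /* kick the global main loop AioContext, never a per-thread one */
        aio_notify(qemu_aio_context);
    }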

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 main-loop.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/main-loop.c b/main-loop.c
index 51eeb0f..2317c54 100644
--- a/main-loop.c
+++ b/main-loop.c
@@ -319,7 +319,8 @@ void qemu_del_wait_object(HANDLE handle, WaitObjectFunc *func, void *opaque)
 
 void qemu_fd_register(int fd)
 {
-    WSAEventSelect(fd, event_notifier_get_handle(&qemu_aio_context->notifier),
+    HANDLE h = event_notifier_get_handle(&(*get_thread_aio_context())->notifier);
+    WSAEventSelect(fd, h,
                    FD_READ | FD_ACCEPT | FD_CLOSE |
                    FD_CONNECT | FD_WRITE | FD_OOB);
 }
@@ -477,12 +478,12 @@ int main_loop_wait(int nonblocking)
 
 QEMUBH *qemu_bh_new(QEMUBHFunc *cb, void *opaque)
 {
-    return aio_bh_new(qemu_aio_context, cb, opaque);
+    return aio_bh_new(*get_thread_aio_context(), cb, opaque);
 }
 
 bool qemu_aio_wait(void)
 {
-    return aio_poll(qemu_aio_context, true);
+    return aio_poll(*get_thread_aio_context(), true);
 }
 
 #ifdef CONFIG_POSIX
@@ -492,8 +493,8 @@ void qemu_aio_set_fd_handler(int fd,
                              AioFlushHandler *io_flush,
                              void *opaque)
 {
-    aio_set_fd_handler(qemu_aio_context, fd, io_read, io_write, io_flush,
-                       opaque);
+    aio_set_fd_handler(*get_thread_aio_context(), fd,
+                       io_read, io_write, io_flush, opaque);
 }
 #endif
 
@@ -501,5 +502,6 @@ void qemu_aio_set_event_notifier(EventNotifier *notifier,
                                  EventNotifierHandler *io_read,
                                  AioFlushEventNotifierHandler *io_flush)
 {
-    aio_set_event_notifier(qemu_aio_context, notifier, io_read, io_flush);
+    aio_set_event_notifier(*get_thread_aio_context(), notifier,
+                           io_read, io_flush);
 }
-- 
1.8.1.4

* [Qemu-devel] [RFC 10/13] block: disable I/O throttling outside main loop
  2013-06-14  9:48 [Qemu-devel] [RFC 00/13] dataplane: use block layer Stefan Hajnoczi
                   ` (8 preceding siblings ...)
  2013-06-14  9:48 ` [Qemu-devel] [RFC 09/13] main-loop: use thread-local AioContext Stefan Hajnoczi
@ 2013-06-14  9:48 ` Stefan Hajnoczi
  2013-06-14  9:48 ` [Qemu-devel] [RFC 11/13] dataplane: use block layer for I/O Stefan Hajnoczi
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-14  9:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Anthony Liguori, Ping Fan Liu,
	Stefan Hajnoczi

Timers only work in the main loop.  This means threads running their own
AioContext cannot use I/O throttling for now.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/block.c b/block.c
index d986fc8..2d7a0f8 100644
--- a/block.c
+++ b/block.c
@@ -169,6 +169,11 @@ static void bdrv_io_limits_intercept(BlockDriverState *bs,
 {
     int64_t wait_time = -1;
 
+    /* Timers currently only work in the main loop */
+    if (*get_thread_aio_context() != qemu_get_aio_context()) {
+        return;
+    }
+
     if (!qemu_co_queue_empty(&bs->throttled_reqs)) {
         qemu_co_queue_wait(&bs->throttled_reqs);
     }
-- 
1.8.1.4

* [Qemu-devel] [RFC 11/13] dataplane: use block layer for I/O
  2013-06-14  9:48 [Qemu-devel] [RFC 00/13] dataplane: use block layer Stefan Hajnoczi
                   ` (9 preceding siblings ...)
  2013-06-14  9:48 ` [Qemu-devel] [RFC 10/13] block: disable I/O throttling outside main loop Stefan Hajnoczi
@ 2013-06-14  9:48 ` Stefan Hajnoczi
  2013-06-14  9:48 ` [Qemu-devel] [RFC 12/13] dataplane: drop ioq Linux AIO request queue Stefan Hajnoczi
  2013-06-14  9:48 ` [Qemu-devel] [RFC 13/13] block: drop raw_get_aio_fd() Stefan Hajnoczi
  12 siblings, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-14  9:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Anthony Liguori, Ping Fan Liu,
	Stefan Hajnoczi

Use the QEMU block layer instead of custom Linux AIO code.  A couple of
changes are necessary to make request processing work in this new
environment:

1. Replace ioq_*() calls with bdrv_aio_*().  We finally support
   asynchronous flush since we get it for free from the QEMU block
   layer!

2. Add a data buffer QEMUIOVector field to struct VirtIOBlockRequest.
   We previously relied on io_submit(2) semantics where the kernel
   copied in iovecs, and we therefore didn't keep them around.  The QEMU
   block layer *does* require that the iovecs are alive across the
   duration of the request.

Note that we still prevent block jobs, hot unplug, and live migration
since there is no synchronization with the main thread.  The next patch
series will introduce a mechanism to arbitrate block layer calls between
threads.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/block/dataplane/virtio-blk.c | 288 +++++++++++++---------------------------
 1 file changed, 92 insertions(+), 196 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 2faed43..aeb3668 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -17,7 +17,6 @@
 #include "qemu/thread.h"
 #include "qemu/error-report.h"
 #include "hw/virtio/dataplane/vring.h"
-#include "ioq.h"
 #include "migration/migration.h"
 #include "block/block.h"
 #include "hw/virtio/virtio-blk.h"
@@ -34,11 +33,10 @@ enum {
 };
 
 typedef struct {
-    struct iocb iocb;               /* Linux AIO control block */
-    QEMUIOVector *inhdr;            /* iovecs for virtio_blk_inhdr */
+    VirtIOBlockDataPlane *dataplane;
+    QEMUIOVector data;              /* data buffer */
+    QEMUIOVector inhdr;             /* iovecs for virtio_blk_inhdr */
     unsigned int head;              /* vring descriptor index */
-    struct iovec *bounce_iov;       /* used if guest buffers are unaligned */
-    QEMUIOVector *read_qiov;        /* for read completion /w bounce buffer */
 } VirtIOBlockRequest;
 
 struct VirtIOBlockDataPlane {
@@ -48,7 +46,7 @@ struct VirtIOBlockDataPlane {
     QemuThread thread;
 
     VirtIOBlkConf *blk;
-    int fd;                         /* image file descriptor */
+    BlockDriverState *bs;           /* block device */
 
     VirtIODevice *vdev;
     Vring vring;                    /* virtqueue vring */
@@ -60,14 +58,8 @@ struct VirtIOBlockDataPlane {
      * use it).
      */
     AioContext *ctx;
-    EventNotifier io_notifier;      /* Linux AIO completion */
     EventNotifier host_notifier;    /* doorbell */
 
-    IOQueue ioqueue;                /* Linux AIO queue (should really be per
-                                       dataplane thread) */
-    VirtIOBlockRequest requests[REQ_MAX]; /* pool of requests, managed by the
-                                             queue */
-
     unsigned int num_reqs;
 
     Error *migration_blocker;
@@ -83,133 +75,113 @@ static void notify_guest(VirtIOBlockDataPlane *s)
     event_notifier_set(s->guest_notifier);
 }
 
-static void complete_request(struct iocb *iocb, ssize_t ret, void *opaque)
+static void complete_request(VirtIOBlockRequest *req,
+                             unsigned char status,
+                             int len)
 {
-    VirtIOBlockDataPlane *s = opaque;
-    VirtIOBlockRequest *req = container_of(iocb, VirtIOBlockRequest, iocb);
-    struct virtio_blk_inhdr hdr;
-    int len;
-
-    if (likely(ret >= 0)) {
-        hdr.status = VIRTIO_BLK_S_OK;
-        len = ret;
-    } else {
-        hdr.status = VIRTIO_BLK_S_IOERR;
-        len = 0;
-    }
-
-    trace_virtio_blk_data_plane_complete_request(s, req->head, ret);
-
-    if (req->read_qiov) {
-        assert(req->bounce_iov);
-        qemu_iovec_from_buf(req->read_qiov, 0, req->bounce_iov->iov_base, len);
-        qemu_iovec_destroy(req->read_qiov);
-        g_slice_free(QEMUIOVector, req->read_qiov);
-    }
-
-    if (req->bounce_iov) {
-        qemu_vfree(req->bounce_iov->iov_base);
-        g_slice_free(struct iovec, req->bounce_iov);
-    }
+    VirtIOBlockDataPlane *s = req->dataplane;
+    unsigned int head = req->head;
+    struct virtio_blk_inhdr hdr = {
+        .status = status,
+    };
 
-    qemu_iovec_from_buf(req->inhdr, 0, &hdr, sizeof(hdr));
-    qemu_iovec_destroy(req->inhdr);
-    g_slice_free(QEMUIOVector, req->inhdr);
+    qemu_iovec_from_buf(&req->inhdr, 0, &hdr, sizeof(hdr));
+    qemu_iovec_destroy(&req->inhdr);
+    qemu_iovec_destroy(&req->data);
+    g_slice_free(VirtIOBlockRequest, req);
 
     /* According to the virtio specification len should be the number of bytes
      * written to, but for virtio-blk it seems to be the number of bytes
      * transferred plus the status bytes.
      */
-    vring_push(&s->vring, req->head, len + sizeof(hdr));
+    vring_push(&s->vring, head, len + sizeof(hdr));
+    notify_guest(s);
 
     s->num_reqs--;
 }
 
-static void complete_request_early(VirtIOBlockDataPlane *s, unsigned int head,
-                                   QEMUIOVector *inhdr, unsigned char status)
+static void request_cb(void *opaque, int ret)
 {
-    struct virtio_blk_inhdr hdr = {
-        .status = status,
-    };
+    VirtIOBlockRequest *req = opaque;
+    VirtIOBlockDataPlane *s = req->dataplane;
+    unsigned char status;
+    int len;
 
-    qemu_iovec_from_buf(inhdr, 0, &hdr, sizeof(hdr));
-    qemu_iovec_destroy(inhdr);
-    g_slice_free(QEMUIOVector, inhdr);
+    /* Completion must be invoked in our thread */
+    assert(*get_thread_aio_context() == s->ctx);
 
-    vring_push(&s->vring, head, sizeof(hdr));
-    notify_guest(s);
+    if (likely(ret >= 0)) {
+        status = VIRTIO_BLK_S_OK;
+        len = ret;
+    } else {
+        status = VIRTIO_BLK_S_IOERR;
+        len = 0;
+    }
+
+    trace_virtio_blk_data_plane_complete_request(s, req->head, ret);
+
+    complete_request(req, status, len);
 }
 
 /* Get disk serial number */
-static void do_get_id_cmd(VirtIOBlockDataPlane *s,
-                          struct iovec *iov, unsigned int iov_cnt,
-                          unsigned int head, QEMUIOVector *inhdr)
+static void do_get_id_cmd(VirtIOBlockRequest *req,
+                          struct iovec *iov, unsigned int iov_cnt)
 {
+    VirtIOBlockDataPlane *s = req->dataplane;
     char id[VIRTIO_BLK_ID_BYTES];
 
     /* Serial number not NUL-terminated when shorter than buffer */
     strncpy(id, s->blk->serial ? s->blk->serial : "", sizeof(id));
     iov_from_buf(iov, iov_cnt, 0, id, sizeof(id));
-    complete_request_early(s, head, inhdr, VIRTIO_BLK_S_OK);
+    complete_request(req, VIRTIO_BLK_S_OK, 0);
 }
 
-static int do_rdwr_cmd(VirtIOBlockDataPlane *s, bool read,
+static int do_rdwr_cmd(VirtIOBlockRequest *req, bool read,
                        struct iovec *iov, unsigned int iov_cnt,
-                       long long offset, unsigned int head,
-                       QEMUIOVector *inhdr)
+                       long long offset)
 {
-    struct iocb *iocb;
-    QEMUIOVector qiov;
-    struct iovec *bounce_iov = NULL;
-    QEMUIOVector *read_qiov = NULL;
-
-    qemu_iovec_init_external(&qiov, iov, iov_cnt);
-    if (!bdrv_qiov_is_aligned(s->blk->conf.bs, &qiov)) {
-        void *bounce_buffer = qemu_blockalign(s->blk->conf.bs, qiov.size);
-
-        if (read) {
-            /* Need to copy back from bounce buffer on completion */
-            read_qiov = g_slice_new(QEMUIOVector);
-            qemu_iovec_init(read_qiov, iov_cnt);
-            qemu_iovec_concat_iov(read_qiov, iov, iov_cnt, 0, qiov.size);
-        } else {
-            qemu_iovec_to_buf(&qiov, 0, bounce_buffer, qiov.size);
-        }
+    VirtIOBlockDataPlane *s = req->dataplane;
+    int64_t sector_num = offset / BDRV_SECTOR_SIZE;
+    int nb_sectors;
+    unsigned int i;
 
-        /* Redirect I/O to aligned bounce buffer */
-        bounce_iov = g_slice_new(struct iovec);
-        bounce_iov->iov_base = bounce_buffer;
-        bounce_iov->iov_len = qiov.size;
-        iov = bounce_iov;
-        iov_cnt = 1;
+    for (i = 0; i < iov_cnt; i++) {
+        qemu_iovec_add(&req->data, iov[i].iov_base, iov[i].iov_len);
     }
+    nb_sectors = req->data.size / BDRV_SECTOR_SIZE;
 
-    iocb = ioq_rdwr(&s->ioqueue, read, iov, iov_cnt, offset);
-
-    /* Fill in virtio block metadata needed for completion */
-    VirtIOBlockRequest *req = container_of(iocb, VirtIOBlockRequest, iocb);
-    req->head = head;
-    req->inhdr = inhdr;
-    req->bounce_iov = bounce_iov;
-    req->read_qiov = read_qiov;
+    if (read) {
+        bdrv_aio_readv(s->bs, sector_num, &req->data, nb_sectors,
+                       request_cb, req);
+    } else {
+        bdrv_aio_writev(s->bs, sector_num, &req->data, nb_sectors,
+                        request_cb, req);
+    }
     return 0;
 }
 
-static int process_request(IOQueue *ioq, struct iovec iov[],
+static int process_request(VirtIOBlockDataPlane *s, struct iovec iov[],
                            unsigned int out_num, unsigned int in_num,
                            unsigned int head)
 {
-    VirtIOBlockDataPlane *s = container_of(ioq, VirtIOBlockDataPlane, ioqueue);
+    VirtIOBlockRequest *req;
     struct iovec *in_iov = &iov[out_num];
     struct virtio_blk_outhdr outhdr;
-    QEMUIOVector *inhdr;
     size_t in_size;
 
+    req = g_slice_new(VirtIOBlockRequest);
+    req->dataplane = s;
+    req->head = head;
+    qemu_iovec_init(&req->data, out_num + in_num);
+    qemu_iovec_init(&req->inhdr, 1);
+
+    s->num_reqs++;
+
     /* Copy in outhdr */
     if (unlikely(iov_to_buf(iov, out_num, 0, &outhdr,
                             sizeof(outhdr)) != sizeof(outhdr))) {
         error_report("virtio-blk request outhdr too short");
-        return -EFAULT;
+        goto fault;
     }
     iov_discard_front(&iov, &out_num, sizeof(outhdr));
 
@@ -217,11 +189,9 @@ static int process_request(IOQueue *ioq, struct iovec iov[],
     in_size = iov_size(in_iov, in_num);
     if (in_size < sizeof(struct virtio_blk_inhdr)) {
         error_report("virtio_blk request inhdr too short");
-        return -EFAULT;
+        goto fault;
     }
-    inhdr = g_slice_new(QEMUIOVector);
-    qemu_iovec_init(inhdr, 1);
-    qemu_iovec_concat_iov(inhdr, in_iov, in_num,
+    qemu_iovec_concat_iov(&req->inhdr, in_iov, in_num,
             in_size - sizeof(struct virtio_blk_inhdr),
             sizeof(struct virtio_blk_inhdr));
     iov_discard_back(in_iov, &in_num, sizeof(struct virtio_blk_inhdr));
@@ -231,37 +201,37 @@ static int process_request(IOQueue *ioq, struct iovec iov[],
 
     switch (outhdr.type) {
     case VIRTIO_BLK_T_IN:
-        do_rdwr_cmd(s, true, in_iov, in_num, outhdr.sector * 512, head, inhdr);
+        do_rdwr_cmd(req, true, in_iov, in_num, outhdr.sector * 512);
         return 0;
 
     case VIRTIO_BLK_T_OUT:
-        do_rdwr_cmd(s, false, iov, out_num, outhdr.sector * 512, head, inhdr);
+        do_rdwr_cmd(req, false, iov, out_num, outhdr.sector * 512);
         return 0;
 
     case VIRTIO_BLK_T_SCSI_CMD:
         /* TODO support SCSI commands */
-        complete_request_early(s, head, inhdr, VIRTIO_BLK_S_UNSUPP);
+        complete_request(req, VIRTIO_BLK_S_UNSUPP, 0);
         return 0;
 
     case VIRTIO_BLK_T_FLUSH:
-        /* TODO fdsync not supported by Linux AIO, do it synchronously here! */
-        if (qemu_fdatasync(s->fd) < 0) {
-            complete_request_early(s, head, inhdr, VIRTIO_BLK_S_IOERR);
-        } else {
-            complete_request_early(s, head, inhdr, VIRTIO_BLK_S_OK);
-        }
+        bdrv_aio_flush(s->bs, request_cb, req);
         return 0;
 
     case VIRTIO_BLK_T_GET_ID:
-        do_get_id_cmd(s, in_iov, in_num, head, inhdr);
+        do_get_id_cmd(req, in_iov, in_num);
         return 0;
 
     default:
         error_report("virtio-blk unsupported request type %#x", outhdr.type);
-        qemu_iovec_destroy(inhdr);
-        g_slice_free(QEMUIOVector, inhdr);
-        return -EFAULT;
+        goto fault;
     }
+
+fault:
+    qemu_iovec_destroy(&req->data);
+    qemu_iovec_destroy(&req->inhdr);
+    g_slice_free(VirtIOBlockRequest, req);
+    s->num_reqs--;
+    return -EFAULT;
 }
 
 static int flush_true(EventNotifier *e)
@@ -273,20 +243,8 @@ static void handle_notify(EventNotifier *e)
 {
     VirtIOBlockDataPlane *s = container_of(e, VirtIOBlockDataPlane,
                                            host_notifier);
-
-    /* There is one array of iovecs into which all new requests are extracted
-     * from the vring.  Requests are read from the vring and the translated
-     * descriptors are written to the iovecs array.  The iovecs do not have to
-     * persist across handle_notify() calls because the kernel copies the
-     * iovecs on io_submit().
-     *
-     * Handling io_submit() EAGAIN may require storing the requests across
-     * handle_notify() calls until the kernel has sufficient resources to
-     * accept more I/O.  This is not implemented yet.
-     */
-    struct iovec iovec[VRING_MAX];
-    struct iovec *end = &iovec[VRING_MAX];
-    struct iovec *iov = iovec;
+    struct iovec iov[VRING_MAX];
+    struct iovec *end = &iov[VRING_MAX];
 
     /* When a request is read from the vring, the index of the first descriptor
      * (aka head) is returned so that the completed request can be pushed onto
@@ -297,7 +255,6 @@ static void handle_notify(EventNotifier *e)
      */
     int head;
     unsigned int out_num = 0, in_num = 0;
-    unsigned int num_queued;
 
     event_notifier_test_and_clear(&s->host_notifier);
     for (;;) {
@@ -313,11 +270,10 @@ static void handle_notify(EventNotifier *e)
             trace_virtio_blk_data_plane_process_request(s, out_num, in_num,
                                                         head);
 
-            if (process_request(&s->ioqueue, iov, out_num, in_num, head) < 0) {
+            if (process_request(s, iov, out_num, in_num, head) < 0) {
                 vring_set_broken(&s->vring);
                 break;
             }
-            iov += out_num + in_num;
         }
 
         if (likely(head == -EAGAIN)) { /* vring emptied */
@@ -327,58 +283,18 @@ static void handle_notify(EventNotifier *e)
             if (vring_enable_notification(s->vdev, &s->vring)) {
                 break;
             }
-        } else { /* head == -ENOBUFS or fatal error, iovecs[] is depleted */
-            /* Since there are no iovecs[] left, stop processing for now.  Do
-             * not re-enable guest->host notifies since the I/O completion
-             * handler knows to check for more vring descriptors anyway.
-             */
-            break;
-        }
-    }
-
-    num_queued = ioq_num_queued(&s->ioqueue);
-    if (num_queued > 0) {
-        s->num_reqs += num_queued;
-
-        int rc = ioq_submit(&s->ioqueue);
-        if (unlikely(rc < 0)) {
-            fprintf(stderr, "ioq_submit failed %d\n", rc);
-            exit(1);
+        } else {
+            return; /* Fatal error, leave ring in broken state */
         }
     }
 }
 
-static int flush_io(EventNotifier *e)
-{
-    VirtIOBlockDataPlane *s = container_of(e, VirtIOBlockDataPlane,
-                                           io_notifier);
-
-    return s->num_reqs > 0;
-}
-
-static void handle_io(EventNotifier *e)
-{
-    VirtIOBlockDataPlane *s = container_of(e, VirtIOBlockDataPlane,
-                                           io_notifier);
-
-    event_notifier_test_and_clear(&s->io_notifier);
-    if (ioq_run_completion(&s->ioqueue, complete_request, s) > 0) {
-        notify_guest(s);
-    }
-
-    /* If there were more requests than iovecs, the vring will not be empty yet
-     * so check again.  There should now be enough resources to process more
-     * requests.
-     */
-    if (unlikely(vring_more_avail(&s->vring))) {
-        handle_notify(&s->host_notifier);
-    }
-}
-
 static void *data_plane_thread(void *opaque)
 {
     VirtIOBlockDataPlane *s = opaque;
 
+    *alloc_thread_aio_context() = s->ctx;
+
     do {
         aio_poll(s->ctx, true);
     } while (!s->stopping || s->num_reqs > 0);
@@ -399,7 +315,6 @@ bool virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
                                   VirtIOBlockDataPlane **dataplane)
 {
     VirtIOBlockDataPlane *s;
-    int fd;
 
     *dataplane = NULL;
 
@@ -418,20 +333,13 @@ bool virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
         return false;
     }
 
-    fd = raw_get_aio_fd(blk->conf.bs);
-    if (fd < 0) {
-        error_report("drive is incompatible with x-data-plane, "
-                     "use format=raw,cache=none,aio=native");
-        return false;
-    }
-
     s = g_new0(VirtIOBlockDataPlane, 1);
     s->vdev = vdev;
-    s->fd = fd;
     s->blk = blk;
+    s->bs = blk->conf.bs;
 
     /* Prevent block operations that conflict with data plane thread */
-    bdrv_set_in_use(blk->conf.bs, 1);
+    bdrv_set_in_use(s->bs, 1);
 
     error_setg(&s->migration_blocker,
             "x-data-plane does not support migration");
@@ -450,7 +358,7 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
     virtio_blk_data_plane_stop(s);
     migrate_del_blocker(s->migration_blocker);
     error_free(s->migration_blocker);
-    bdrv_set_in_use(s->blk->conf.bs, 0);
+    bdrv_set_in_use(s->bs, 0);
     g_free(s);
 }
 
@@ -459,7 +367,6 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s->vdev)));
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
     VirtQueue *vq;
-    int i;
 
     if (s->started) {
         return;
@@ -488,14 +395,6 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
     s->host_notifier = *virtio_queue_get_host_notifier(vq);
     aio_set_event_notifier(s->ctx, &s->host_notifier, handle_notify, flush_true);
 
-    /* Set up ioqueue */
-    ioq_init(&s->ioqueue, s->fd, REQ_MAX);
-    for (i = 0; i < ARRAY_SIZE(s->requests); i++) {
-        ioq_put_iocb(&s->ioqueue, &s->requests[i].iocb);
-    }
-    s->io_notifier = *ioq_get_notifier(&s->ioqueue);
-    aio_set_event_notifier(s->ctx, &s->io_notifier, handle_io, flush_io);
-
     s->started = true;
     trace_virtio_blk_data_plane_start(s);
 
@@ -526,9 +425,6 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
         qemu_thread_join(&s->thread);
     }
 
-    aio_set_event_notifier(s->ctx, &s->io_notifier, NULL, NULL);
-    ioq_cleanup(&s->ioqueue);
-
     aio_set_event_notifier(s->ctx, &s->host_notifier, NULL, NULL);
     k->set_host_notifier(qbus->parent, 0, false);
 
-- 
1.8.1.4


* [Qemu-devel] [RFC 12/13] dataplane: drop ioq Linux AIO request queue
  2013-06-14  9:48 [Qemu-devel] [RFC 00/13] dataplane: use block layer Stefan Hajnoczi
                   ` (10 preceding siblings ...)
  2013-06-14  9:48 ` [Qemu-devel] [RFC 11/13] dataplane: use block layer for I/O Stefan Hajnoczi
@ 2013-06-14  9:48 ` Stefan Hajnoczi
  2013-06-14  9:48 ` [Qemu-devel] [RFC 13/13] block: drop raw_get_aio_fd() Stefan Hajnoczi
  12 siblings, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-14  9:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Anthony Liguori, Ping Fan Liu,
	Stefan Hajnoczi

virtio-blk data plane used a custom Linux AIO request queue called
"ioq".  Now that virtio-blk data plane uses the QEMU block layer, we can
drop ioq.
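
At the call sites this boils down to roughly the following (a sketch
pieced together from the diffs in this series, not code copied verbatim
from the tree):

    /* before: queue an iocb, then batch-submit via io_submit() */
    struct iocb *iocb = ioq_rdwr(&s->ioqueue, read, iov, iov_cnt, offset);
    ...
    ioq_submit(&s->ioqueue);

    /* after: hand each request straight to the block layer */
    if (read) {
        bdrv_aio_readv(s->bs, sector_num, &req->data, nb_sectors,
                       request_cb, req);
    } else {
        bdrv_aio_writev(s->bs, sector_num, &req->data, nb_sectors,
                        request_cb, req);
    }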

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 hw/block/dataplane/Makefile.objs |   2 +-
 hw/block/dataplane/ioq.c         | 117 ---------------------------------------
 hw/block/dataplane/ioq.h         |  57 -------------------
 3 files changed, 1 insertion(+), 175 deletions(-)
 delete mode 100644 hw/block/dataplane/ioq.c
 delete mode 100644 hw/block/dataplane/ioq.h

diff --git a/hw/block/dataplane/Makefile.objs b/hw/block/dataplane/Makefile.objs
index 9da2eb8..e786f66 100644
--- a/hw/block/dataplane/Makefile.objs
+++ b/hw/block/dataplane/Makefile.objs
@@ -1 +1 @@
-obj-y += ioq.o virtio-blk.o
+obj-y += virtio-blk.o
diff --git a/hw/block/dataplane/ioq.c b/hw/block/dataplane/ioq.c
deleted file mode 100644
index f709f87..0000000
--- a/hw/block/dataplane/ioq.c
+++ /dev/null
@@ -1,117 +0,0 @@
-/*
- * Linux AIO request queue
- *
- * Copyright 2012 IBM, Corp.
- * Copyright 2012 Red Hat, Inc. and/or its affiliates
- *
- * Authors:
- *   Stefan Hajnoczi <stefanha@redhat.com>
- *
- * This work is licensed under the terms of the GNU GPL, version 2 or later.
- * See the COPYING file in the top-level directory.
- *
- */
-
-#include "ioq.h"
-
-void ioq_init(IOQueue *ioq, int fd, unsigned int max_reqs)
-{
-    int rc;
-
-    ioq->fd = fd;
-    ioq->max_reqs = max_reqs;
-
-    memset(&ioq->io_ctx, 0, sizeof ioq->io_ctx);
-    rc = io_setup(max_reqs, &ioq->io_ctx);
-    if (rc != 0) {
-        fprintf(stderr, "ioq io_setup failed %d\n", rc);
-        exit(1);
-    }
-
-    rc = event_notifier_init(&ioq->io_notifier, 0);
-    if (rc != 0) {
-        fprintf(stderr, "ioq io event notifier creation failed %d\n", rc);
-        exit(1);
-    }
-
-    ioq->freelist = g_malloc0(sizeof ioq->freelist[0] * max_reqs);
-    ioq->freelist_idx = 0;
-
-    ioq->queue = g_malloc0(sizeof ioq->queue[0] * max_reqs);
-    ioq->queue_idx = 0;
-}
-
-void ioq_cleanup(IOQueue *ioq)
-{
-    g_free(ioq->freelist);
-    g_free(ioq->queue);
-
-    event_notifier_cleanup(&ioq->io_notifier);
-    io_destroy(ioq->io_ctx);
-}
-
-EventNotifier *ioq_get_notifier(IOQueue *ioq)
-{
-    return &ioq->io_notifier;
-}
-
-struct iocb *ioq_get_iocb(IOQueue *ioq)
-{
-    /* Underflow cannot happen since ioq is sized for max_reqs */
-    assert(ioq->freelist_idx != 0);
-
-    struct iocb *iocb = ioq->freelist[--ioq->freelist_idx];
-    ioq->queue[ioq->queue_idx++] = iocb;
-    return iocb;
-}
-
-void ioq_put_iocb(IOQueue *ioq, struct iocb *iocb)
-{
-    /* Overflow cannot happen since ioq is sized for max_reqs */
-    assert(ioq->freelist_idx != ioq->max_reqs);
-
-    ioq->freelist[ioq->freelist_idx++] = iocb;
-}
-
-struct iocb *ioq_rdwr(IOQueue *ioq, bool read, struct iovec *iov,
-                      unsigned int count, long long offset)
-{
-    struct iocb *iocb = ioq_get_iocb(ioq);
-
-    if (read) {
-        io_prep_preadv(iocb, ioq->fd, iov, count, offset);
-    } else {
-        io_prep_pwritev(iocb, ioq->fd, iov, count, offset);
-    }
-    io_set_eventfd(iocb, event_notifier_get_fd(&ioq->io_notifier));
-    return iocb;
-}
-
-int ioq_submit(IOQueue *ioq)
-{
-    int rc = io_submit(ioq->io_ctx, ioq->queue_idx, ioq->queue);
-    ioq->queue_idx = 0; /* reset */
-    return rc;
-}
-
-int ioq_run_completion(IOQueue *ioq, IOQueueCompletion *completion,
-                       void *opaque)
-{
-    struct io_event events[ioq->max_reqs];
-    int nevents, i;
-
-    do {
-        nevents = io_getevents(ioq->io_ctx, 0, ioq->max_reqs, events, NULL);
-    } while (nevents < 0 && errno == EINTR);
-    if (nevents < 0) {
-        return nevents;
-    }
-
-    for (i = 0; i < nevents; i++) {
-        ssize_t ret = ((uint64_t)events[i].res2 << 32) | events[i].res;
-
-        completion(events[i].obj, ret, opaque);
-        ioq_put_iocb(ioq, events[i].obj);
-    }
-    return nevents;
-}
diff --git a/hw/block/dataplane/ioq.h b/hw/block/dataplane/ioq.h
deleted file mode 100644
index b49b5de..0000000
--- a/hw/block/dataplane/ioq.h
+++ /dev/null
@@ -1,57 +0,0 @@
-/*
- * Linux AIO request queue
- *
- * Copyright 2012 IBM, Corp.
- * Copyright 2012 Red Hat, Inc. and/or its affiliates
- *
- * Authors:
- *   Stefan Hajnoczi <stefanha@redhat.com>
- *
- * This work is licensed under the terms of the GNU GPL, version 2 or later.
- * See the COPYING file in the top-level directory.
- *
- */
-
-#ifndef IOQ_H
-#define IOQ_H
-
-#include <libaio.h>
-#include "qemu/event_notifier.h"
-
-typedef struct {
-    int fd;                         /* file descriptor */
-    unsigned int max_reqs;          /* max length of freelist and queue */
-
-    io_context_t io_ctx;            /* Linux AIO context */
-    EventNotifier io_notifier;      /* Linux AIO eventfd */
-
-    /* Requests can complete in any order so a free list is necessary to manage
-     * available iocbs.
-     */
-    struct iocb **freelist;         /* free iocbs */
-    unsigned int freelist_idx;
-
-    /* Multiple requests are queued up before submitting them all in one go */
-    struct iocb **queue;            /* queued iocbs */
-    unsigned int queue_idx;
-} IOQueue;
-
-void ioq_init(IOQueue *ioq, int fd, unsigned int max_reqs);
-void ioq_cleanup(IOQueue *ioq);
-EventNotifier *ioq_get_notifier(IOQueue *ioq);
-struct iocb *ioq_get_iocb(IOQueue *ioq);
-void ioq_put_iocb(IOQueue *ioq, struct iocb *iocb);
-struct iocb *ioq_rdwr(IOQueue *ioq, bool read, struct iovec *iov,
-                      unsigned int count, long long offset);
-int ioq_submit(IOQueue *ioq);
-
-static inline unsigned int ioq_num_queued(IOQueue *ioq)
-{
-    return ioq->queue_idx;
-}
-
-typedef void IOQueueCompletion(struct iocb *iocb, ssize_t ret, void *opaque);
-int ioq_run_completion(IOQueue *ioq, IOQueueCompletion *completion,
-                       void *opaque);
-
-#endif /* IOQ_H */
-- 
1.8.1.4


* [Qemu-devel] [RFC 13/13] block: drop raw_get_aio_fd()
  2013-06-14  9:48 [Qemu-devel] [RFC 00/13] dataplane: use block layer Stefan Hajnoczi
                   ` (11 preceding siblings ...)
  2013-06-14  9:48 ` [Qemu-devel] [RFC 12/13] dataplane: drop ioq Linux AIO request queue Stefan Hajnoczi
@ 2013-06-14  9:48 ` Stefan Hajnoczi
  12 siblings, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-14  9:48 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Anthony Liguori, Ping Fan Liu,
	Stefan Hajnoczi

virtio-blk data plane only worked with -drive format=raw,aio=native.  It
used raw_get_aio_fd() to fetch the file descriptor from raw-posix.  This
was a layering violation and is no longer necessary now that virtio-blk
data plane uses the QEMU block layer.
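
For reference, the caller side shrinks roughly as follows (a sketch
based on the dataplane patch earlier in this series):

    /* before: hijack the raw-posix file descriptor */
    fd = raw_get_aio_fd(blk->conf.bs);
    if (fd < 0) {
        error_report("drive is incompatible with x-data-plane, "
                     "use format=raw,cache=none,aio=native");
        return false;
    }

    /* after: any image format or protocol goes through the block layer */
    s->bs = blk->conf.bs;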

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/raw-posix.c     | 34 ----------------------------------
 include/block/block.h |  8 --------
 2 files changed, 42 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 821c19f..c4d0000 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1910,40 +1910,6 @@ static BlockDriver bdrv_host_cdrom = {
 };
 #endif /* __FreeBSD__ */
 
-#ifdef CONFIG_LINUX_AIO
-/**
- * Return the file descriptor for Linux AIO
- *
- * This function is a layering violation and should be removed when it becomes
- * possible to call the block layer outside the global mutex.  It allows the
- * caller to hijack the file descriptor so I/O can be performed outside the
- * block layer.
- */
-int raw_get_aio_fd(BlockDriverState *bs)
-{
-    BDRVRawState *s;
-
-    if (!bs->drv) {
-        return -ENOMEDIUM;
-    }
-
-    if (bs->drv == bdrv_find_format("raw")) {
-        bs = bs->file;
-    }
-
-    /* raw-posix has several protocols so just check for raw_aio_readv */
-    if (bs->drv->bdrv_aio_readv != raw_aio_readv) {
-        return -ENOTSUP;
-    }
-
-    s = bs->opaque;
-    if (!s->use_aio) {
-        return -ENOTSUP;
-    }
-    return s->fd;
-}
-#endif /* CONFIG_LINUX_AIO */
-
 static void bdrv_file_init(void)
 {
     /*
diff --git a/include/block/block.h b/include/block/block.h
index 85147bb..4273458 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -364,14 +364,6 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs);
 void bdrv_set_in_use(BlockDriverState *bs, int in_use);
 int bdrv_in_use(BlockDriverState *bs);
 
-#ifdef CONFIG_LINUX_AIO
-int raw_get_aio_fd(BlockDriverState *bs);
-#else
-static inline int raw_get_aio_fd(BlockDriverState *bs)
-{
-    return -ENOTSUP;
-}
-#endif
 
 enum BlockAcctType {
     BDRV_ACCT_READ,
-- 
1.8.1.4


* Re: [Qemu-devel] [RFC 08/13] block: drop bdrv_get_aio_context()
  2013-06-14  9:48 ` [Qemu-devel] [RFC 08/13] block: drop bdrv_get_aio_context() Stefan Hajnoczi
@ 2013-06-14 14:12   ` Paolo Bonzini
  2013-06-17 14:57     ` Stefan Hajnoczi
  0 siblings, 1 reply; 22+ messages in thread
From: Paolo Bonzini @ 2013-06-14 14:12 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Kevin Wolf, Anthony Liguori, Ping Fan Liu, qemu-devel

On 14/06/2013 05:48, Stefan Hajnoczi wrote:
> Associating a BlockDriverState with a single AioContext is not flexible
> enough.  Once we make BlockDriverState thread-safe, it will be possible
> to call bdrv_*() functions from multiple event loops.

I'm afraid that this is trading some pain now (converting
qemu_bh_new/qemu_set_fd_handler calls to aio_bh_new/aio_set_fd_handler) for
more pain later (having to make BDS thread-safe).  There aren't that
many such calls (~40) in block layer code.

Making BlockDriverState thread-safe is hard; it is much simpler to run
all the BlockDriverState code in the AioContext thread itself.

There are some things that cannot run there (basically monitor commands
and other places that currently use bdrv_drain_all()), but they can
simply take a "big AioContext lock".

Paolo


* Re: [Qemu-devel] [RFC 04/13] virtio-blk: implement BlockDevOps->drain_threads_cb()
  2013-06-14  9:48 ` [Qemu-devel] [RFC 04/13] virtio-blk: implement BlockDevOps->drain_threads_cb() Stefan Hajnoczi
@ 2013-06-14 14:23   ` Paolo Bonzini
  0 siblings, 0 replies; 22+ messages in thread
From: Paolo Bonzini @ 2013-06-14 14:23 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Kevin Wolf, Anthony Liguori, Ping Fan Liu, qemu-devel

On 14/06/2013 05:48, Stefan Hajnoczi wrote:
> Drain and stop the dataplane thread when bdrv_drain_all() is called.
> Note that the thread will be restarted in virtio_blk_handle_output() the
> next time the guest kicks the virtqueue.

Long term I still think we want to move towards

    aio_lock(bs)
    bdrv_drain(bs)
    ...
    aio_unlock(bs)

but this approach works well at this point.  It is a hack, but a very
smart one. :)

Add a comment in the code about what will start the dataplane thread,
though.
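
Something along these lines, maybe (just a sketch, wording up to you):

    /* The data plane thread was stopped by bdrv_drain_all().  It is
     * restarted lazily: the next virtqueue kick reaches
     * virtio_blk_handle_output(), which calls
     * virtio_blk_data_plane_start() again.
     */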

Paolo

> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  hw/block/virtio-blk.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index cf12469..f9c2b79 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -618,8 +618,20 @@ static void virtio_blk_resize(void *opaque)
>      virtio_notify_config(vdev);
>  }
>  
> +static void virtio_blk_drain_threads(void *opaque)
> +{
> +#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
> +    VirtIOBlock *s = VIRTIO_BLK(opaque);
> +
> +    if (s->dataplane) {
> +        virtio_blk_data_plane_stop(s->dataplane);
> +    }
> +#endif
> +}
> +
>  static const BlockDevOps virtio_block_ops = {
>      .resize_cb = virtio_blk_resize,
> +    .drain_threads_cb = virtio_blk_drain_threads,
>  };
>  
>  void virtio_blk_set_conf(DeviceState *dev, VirtIOBlkConf *blk)
> 


* Re: [Qemu-devel] [RFC 08/13] block: drop bdrv_get_aio_context()
  2013-06-14 14:12   ` Paolo Bonzini
@ 2013-06-17 14:57     ` Stefan Hajnoczi
  2013-06-17 15:03       ` Paolo Bonzini
  0 siblings, 1 reply; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-17 14:57 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Anthony Liguori, Ping Fan Liu, qemu-devel,
	Stefan Hajnoczi

On Fri, Jun 14, 2013 at 10:12:00AM -0400, Paolo Bonzini wrote:
> On 14/06/2013 05:48, Stefan Hajnoczi wrote:
> > Associating a BlockDriverState with a single AioContext is not flexible
> > enough.  Once we make BlockDriverState thread-safe, it will be possible
> > to call bdrv_*() functions from multiple event loops.
> 
> I'm afraid that this is trading some pain now (converting
> qemu_bh_new/qemu_set_fd_handler calls to aio_bh_new/aio_set_fd_handler) for
> more pain later (having to make BDS thread-safe).  There aren't that
> many such calls (~40) in block layer code.
> 
> Making BlockDriverState thread-safe is hard; it is much simpler to run
> all the BlockDriverState code in the AioContext thread itself.
> 
> There are some things that cannot run there (basically monitor commands
> and other places that currently use bdrv_drain_all()), but they can
> simply take a "big AioContext lock".

I don't like a big AioContext lock that stops the event loop, because we
need to support an M:N device:iothread model where stopping an event loop
means artificially stopping other devices too.

Maybe it's workable, though, since the commands that would take an
AioContext lock are rare and shouldn't impact performance.  I guess then
it comes down to robustness: whether a hung NFS mount can affect the
entire VM instead of just hanging an emulated disk device.

Still, I have tried the lock approach and agree it is hard.  I'll
experiment some more and hopefully come up with something that handles
block jobs and the run-time NBD server.

Stefan


* Re: [Qemu-devel] [RFC 08/13] block: drop bdrv_get_aio_context()
  2013-06-17 14:57     ` Stefan Hajnoczi
@ 2013-06-17 15:03       ` Paolo Bonzini
  2013-06-18  9:29         ` Stefan Hajnoczi
  0 siblings, 1 reply; 22+ messages in thread
From: Paolo Bonzini @ 2013-06-17 15:03 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Anthony Liguori, Ping Fan Liu, qemu-devel,
	Stefan Hajnoczi

On 17/06/2013 16:57, Stefan Hajnoczi wrote:
> On Fri, Jun 14, 2013 at 10:12:00AM -0400, Paolo Bonzini wrote:
>> On 14/06/2013 05:48, Stefan Hajnoczi wrote:
>>> Associating a BlockDriverState with a single AioContext is not flexible
>>> enough.  Once we make BlockDriverState thread-safe, it will be possible
>>> to call bdrv_*() functions from multiple event loops.
>>
>> I'm afraid that this is trading some pain now (converting
>> qemu_bh_new/qemu_set_fd_handler calls to aio_bh_new/aio_set_fd_handler) for
>> more pain later (having to make BDS thread-safe).  There aren't that
>> many such calls (~40) in block layer code.
>>
>> Making BlockDriverState thread-safe is hard; it is much simpler to run
>> all the BlockDriverState code in the AioContext thread itself.
>>
>> There are some things that cannot run there (basically monitor commands
>> and other places that currently use bdrv_drain_all()), but they can
>> simply take a "big AioContext lock".
> 
> I don't like a big AioContext lock that stops the event loop, because we
> need to support an M:N device:iothread model where stopping an event loop
> means artificially stopping other devices too.

The event loop would still run via aio_wait(), just in the monitor
thread.  However, it would stop the ioeventfd through a mechanism such
as the one you added in this series, or something like that.

> Maybe it's workable, though, since the commands that would take an
> AioContext lock are rare and shouldn't impact performance.

Yes.

> I guess then
> it comes down to robustness: whether a hung NFS mount can affect the
> entire VM instead of just hanging an emulated disk device.

That would only be a hung NFS mount while running something that drains
a BDS, no?  It would hang the monitor, but it wouldn't be a regression
compared to what we have now, obviously.

Paolo


* Re: [Qemu-devel] [RFC 08/13] block: drop bdrv_get_aio_context()
  2013-06-17 15:03       ` Paolo Bonzini
@ 2013-06-18  9:29         ` Stefan Hajnoczi
  0 siblings, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-18  9:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Anthony Liguori, Ping Fan Liu, qemu-devel,
	Stefan Hajnoczi

On Mon, Jun 17, 2013 at 05:03:12PM +0200, Paolo Bonzini wrote:
> On 17/06/2013 16:57, Stefan Hajnoczi wrote:
> > On Fri, Jun 14, 2013 at 10:12:00AM -0400, Paolo Bonzini wrote:
> >> On 14/06/2013 05:48, Stefan Hajnoczi wrote:
> > I guess then
> > it comes down to robustness: whether a hung NFS mount can affect the
> > entire VM instead of just hanging an emulated disk device.
> 
> That would only be a hung NFS mount while running something that drains
> a BDS, no?  It would hang the monitor, but it wouldn't be a regression
> compared to what we have now, obviously.

Yes, no regression from what we have today.

Stefan


* Re: [Qemu-devel] [RFC 06/13] qemu-thread: add TLS wrappers
  2013-06-14  9:48 ` [Qemu-devel] [RFC 06/13] qemu-thread: add TLS wrappers Stefan Hajnoczi
@ 2013-06-20  7:26   ` Fam Zheng
  2013-06-20  8:50     ` Paolo Bonzini
  0 siblings, 1 reply; 22+ messages in thread
From: Fam Zheng @ 2013-06-20  7:26 UTC (permalink / raw)
  To: Stefan Hajnoczi, Paolo Bonzini
  Cc: Kevin Wolf, Anthony Liguori, Ping Fan Liu, qemu-devel

On Fri, 06/14 11:48, Stefan Hajnoczi wrote:
> From: Paolo Bonzini <pbonzini@redhat.com>
> 
> Fast TLS is not available on some platforms, but it is always nice to
> use it.  This wrapper implementation falls back to pthread_get/setspecific
> on POSIX systems that lack __thread, but uses the dynamic linker's TLS
> support on Linux and Windows.
> 
> The user shall call alloc_foo() in every thread that needs to access the
> variable---exactly once and before any access.  foo is the name of the
> variable as passed to DECLARE_TLS and DEFINE_TLS.  Then, get_foo() will
> return the address of the variable.  It is guaranteed to remain the same
> across the lifetime of a thread, so you can cache it.

Would tls_alloc_foo() and tls_get_foo() be easier to read and less
prone to name conflicts?
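
I.e., with the thread_aio_context variable from this series and the
naming convention described above, callers would go from

    *alloc_thread_aio_context() = ctx;
    AioContext *actx = *get_thread_aio_context();

to

    *tls_alloc_thread_aio_context() = ctx;
    AioContext *actx = *tls_get_thread_aio_context();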

Fam


* Re: [Qemu-devel] [RFC 06/13] qemu-thread: add TLS wrappers
  2013-06-20  7:26   ` Fam Zheng
@ 2013-06-20  8:50     ` Paolo Bonzini
  2013-06-20 12:54       ` Stefan Hajnoczi
  0 siblings, 1 reply; 22+ messages in thread
From: Paolo Bonzini @ 2013-06-20  8:50 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel, Kevin Wolf, Anthony Liguori,
	Ping Fan Liu

On 20/06/2013 09:26, Fam Zheng wrote:
> On Fri, 06/14 11:48, Stefan Hajnoczi wrote:
>> From: Paolo Bonzini <pbonzini@redhat.com>
>>
>> Fast TLS is not available on some platforms, but it is always nice to
>> use it.  This wrapper implementation falls back to pthread_get/setspecific
>> on POSIX systems that lack __thread, but uses the dynamic linker's TLS
>> support on Linux and Windows.
>>
>> The user shall call alloc_foo() in every thread that needs to access the
>> variable---exactly once and before any access.  foo is the name of the
>> variable as passed to DECLARE_TLS and DEFINE_TLS.  Then, get_foo() will
>> return the address of the variable.  It is guaranteed to remain the same
>> across the lifetime of a thread, so you can cache it.
> 
> Would tls_alloc_foo() and tls_get_foo() be easier to read and less
> prone to name conflicts?

Fine by me.

Paolo


* Re: [Qemu-devel] [RFC 06/13] qemu-thread: add TLS wrappers
  2013-06-20  8:50     ` Paolo Bonzini
@ 2013-06-20 12:54       ` Stefan Hajnoczi
  0 siblings, 0 replies; 22+ messages in thread
From: Stefan Hajnoczi @ 2013-06-20 12:54 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, Anthony Liguori, Ping Fan Liu, qemu-devel

On Thu, Jun 20, 2013 at 10:50:32AM +0200, Paolo Bonzini wrote:
> On 20/06/2013 09:26, Fam Zheng wrote:
> > On Fri, 06/14 11:48, Stefan Hajnoczi wrote:
> >> From: Paolo Bonzini <pbonzini@redhat.com>
> >>
> >> Fast TLS is not available on some platforms, but it is always nice to
> >> use it.  This wrapper implementation falls back to pthread_get/setspecific
> >> on POSIX systems that lack __thread, but uses the dynamic linker's TLS
> >> support on Linux and Windows.
> >>
> >> The user shall call alloc_foo() in every thread that needs to access the
> >> variable---exactly once and before any access.  foo is the name of the
> >> variable as passed to DECLARE_TLS and DEFINE_TLS.  Then, get_foo() will
> >> return the address of the variable.  It is guaranteed to remain the same
> >> across the lifetime of a thread, so you can cache it.
> > 
> > Would tls_alloc_foo() and tls_get_foo() be easier to read and less
> > prone to name conflicts?
> 
> Fine by me.

Nice idea.  Will fix in the next version.

Stefan


end of thread, other threads: [~2013-06-20 13:02 UTC | newest]

Thread overview: 22+ messages
2013-06-14  9:48 [Qemu-devel] [RFC 00/13] dataplane: use block layer Stefan Hajnoczi
2013-06-14  9:48 ` [Qemu-devel] [RFC 01/13] block: fix bdrv_flush() ordering in bdrv_close() Stefan Hajnoczi
2013-06-14  9:48 ` [Qemu-devel] [RFC 02/13] dataplane: sync virtio.c and vring.c virtqueue state Stefan Hajnoczi
2013-06-14  9:48 ` [Qemu-devel] [RFC 03/13] block: add BlockDevOps->drain_threads_cb() Stefan Hajnoczi
2013-06-14  9:48 ` [Qemu-devel] [RFC 04/13] virtio-blk: implement BlockDevOps->drain_threads_cb() Stefan Hajnoczi
2013-06-14 14:23   ` Paolo Bonzini
2013-06-14  9:48 ` [Qemu-devel] [RFC 05/13] exec: do not use qemu/tls.h Stefan Hajnoczi
2013-06-14  9:48 ` [Qemu-devel] [RFC 06/13] qemu-thread: add TLS wrappers Stefan Hajnoczi
2013-06-20  7:26   ` Fam Zheng
2013-06-20  8:50     ` Paolo Bonzini
2013-06-20 12:54       ` Stefan Hajnoczi
2013-06-14  9:48 ` [Qemu-devel] [RFC 07/13] block: add thread_aio_context TLS variable Stefan Hajnoczi
2013-06-14  9:48 ` [Qemu-devel] [RFC 08/13] block: drop bdrv_get_aio_context() Stefan Hajnoczi
2013-06-14 14:12   ` Paolo Bonzini
2013-06-17 14:57     ` Stefan Hajnoczi
2013-06-17 15:03       ` Paolo Bonzini
2013-06-18  9:29         ` Stefan Hajnoczi
2013-06-14  9:48 ` [Qemu-devel] [RFC 09/13] main-loop: use thread-local AioContext Stefan Hajnoczi
2013-06-14  9:48 ` [Qemu-devel] [RFC 10/13] block: disable I/O throttling outside main loop Stefan Hajnoczi
2013-06-14  9:48 ` [Qemu-devel] [RFC 11/13] dataplane: use block layer for I/O Stefan Hajnoczi
2013-06-14  9:48 ` [Qemu-devel] [RFC 12/13] dataplane: drop ioq Linux AIO request queue Stefan Hajnoczi
2013-06-14  9:48 ` [Qemu-devel] [RFC 13/13] block: drop raw_get_aio_fd() Stefan Hajnoczi
