qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH v2 0/9] tcg: signal-free qemu_cpu_kick
@ 2015-08-26  0:17 Paolo Bonzini
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 1/9] i8257: rewrite DMA_schedule to avoid hooking into the CPU loop Paolo Bonzini
                   ` (9 more replies)
  0 siblings, 10 replies; 13+ messages in thread
From: Paolo Bonzini @ 2015-08-26  0:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: mttcg, cota, rth

This version of the signal-free qemu_cpu_kick patches is, ehm, much
better.  Variables are accessed either with Java-style volatiles or
protected by memory barriers, and the cleanups go further by removing
qemu/tls.h and C volatiles.

The logic is relatively simple.  The I/O thread does the following
(letters in parentheses indicate the synchronizes-with edges):

    run_on_cpu or similar
    ...
    seq_cst write 1 to exit_request        (C)
    seq_cst read tcg_current_cpu to cpu    (B)
    if not NULL
       write 1 to cpu->exit_request
       release barrier                     (A)
       write 1 to cpu->tcg_exit_req

The CPU thread does either this:

    (in generated code) read cpu->tcg_exit_req
    acquire barrier                        (A)
    read cpu->exit_request
    exit from cpu_exec
    seq_cst write 0 to exit_request
    ...
    flush_queued_work or similar

or this:

    seq_cst write to tcg_current_cpu       (B)
    seq_cst read from exit_request         (C)
    exit from cpu_exec
    seq_cst write 0 to exit_request
    ...
    flush_queued_work or similar

The non-TLS tcg_current_cpu will go away with multi-threaded TCG.
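
For readers who prefer code to prose, the handshake above can be
approximated with C11 atomics.  This is only a sketch under stated
assumptions, not code from the series: SketchCPU and every sketch_*
name are hypothetical, seq_cst loads and stores stand in for the
"Java-style volatile" accesses, and the release/acquire fences stand
in for the barriers marked (A).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for CPUState; only the two flags matter here. */
typedef struct SketchCPU {
    atomic_int exit_request;   /* checked by the C execution loop */
    atomic_int tcg_exit_req;   /* polled by generated code */
} SketchCPU;

static atomic_int sketch_exit_request;           /* models exit_request */
static _Atomic(SketchCPU *) sketch_current_cpu;  /* models tcg_current_cpu */

/* I/O thread side: publish the request, then look for a running CPU. */
static void sketch_kick(void)
{
    atomic_store(&sketch_exit_request, 1);                  /* (C) */
    SketchCPU *cpu = atomic_load(&sketch_current_cpu);      /* (B) */
    if (cpu != NULL) {
        atomic_store_explicit(&cpu->exit_request, 1,
                              memory_order_relaxed);
        atomic_thread_fence(memory_order_release);          /* (A) */
        atomic_store_explicit(&cpu->tcg_exit_req, 1,
                              memory_order_relaxed);
    }
}

/* CPU thread, second path: announce ourselves, then check the global. */
static bool sketch_cpu_enter(SketchCPU *cpu)
{
    atomic_store(&sketch_current_cpu, cpu);                 /* (B) */
    return atomic_load(&sketch_exit_request) != 0;          /* (C) */
}

/* CPU thread, first path: what the generated code's poll amounts to. */
static bool sketch_poll_in_generated_code(SketchCPU *cpu)
{
    if (!atomic_load_explicit(&cpu->tcg_exit_req, memory_order_relaxed)) {
        return false;
    }
    atomic_thread_fence(memory_order_acquire);              /* (A) */
    return atomic_load_explicit(&cpu->exit_request,
                                memory_order_relaxed) != 0;
}

/* CPU thread leaving the execution loop; per-CPU flags are not reset
 * here, to keep the sketch short. */
static void sketch_cpu_leave(void)
{
    atomic_store_explicit(&sketch_current_cpu, NULL, memory_order_relaxed);
    atomic_store(&sketch_exit_request, 0);
}
```

Because both sides do a seq_cst store followed by a seq_cst load of
the other side's variable, at least one of them observes the other:
either the kicker finds a non-NULL CPU, or the CPU thread finds the
global request already set.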

Paolo

Paolo Bonzini (9):
  i8257: rewrite DMA_schedule to avoid hooking into the CPU loop
  i8257: remove cpu_request_exit irq
  tcg: introduce tcg_current_cpu
  remove qemu/tls.h
  tcg: assign cpu->current_tb in a simpler place
  tcg: synchronize cpu->exit_request and cpu->tcg_exit_req accesses
  tcg: synchronize exit_request and tcg_current_cpu accesses
  use qemu_cpu_kick instead of cpu_exit or qemu_cpu_kick_thread
  tcg: signal-free qemu_cpu_kick

 cpu-exec.c              | 33 ++++++++----------
 cpus.c                  | 91 ++++++++++++++-----------------------------------
 exec.c                  |  2 +-
 gdbstub.c               |  2 +-
 hw/block/fdc.c          |  2 +-
 hw/dma/i82374.c         |  5 +--
 hw/dma/i8257.c          | 31 +++++++++--------
 hw/i386/pc.c            | 13 +------
 hw/isa/i82378.c         |  3 +-
 hw/mips/mips_fulong2e.c | 13 +------
 hw/mips/mips_jazz.c     | 13 +------
 hw/mips/mips_malta.c    | 13 +------
 hw/ppc/prep.c           | 11 ------
 hw/ppc/spapr_rtas.c     |  2 +-
 hw/sparc/sun4m.c        |  4 +--
 hw/sparc64/sun4u.c      |  4 +--
 include/exec/exec-all.h |  5 +--
 include/hw/isa/isa.h    |  4 +--
 include/qemu/tls.h      | 52 ----------------------------
 include/qom/cpu.h       |  8 ++---
 qom/cpu.c               |  2 ++
 21 files changed, 80 insertions(+), 233 deletions(-)
 delete mode 100644 include/qemu/tls.h

-- 
2.4.3


* [Qemu-devel] [PATCH 1/9] i8257: rewrite DMA_schedule to avoid hooking into the CPU loop
  2015-08-26  0:17 [Qemu-devel] [PATCH v2 0/9] tcg: signal-free qemu_cpu_kick Paolo Bonzini
@ 2015-08-26  0:17 ` Paolo Bonzini
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 2/9] i8257: remove cpu_request_exit irq Paolo Bonzini
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2015-08-26  0:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: mttcg, cota, rth

The i8257 DMA controller uses an idle bottom half, which by default
does not cause the main loop to exit.  Therefore, the DMA_schedule
function is there to ensure that the CPU relinquishes the iothread
mutex to the iothread.

However, this is not enough since the iothread will call
aio_compute_timeout() and go to sleep again.  In the iothread
world, forcing execution of the idle bottom half is much simpler,
and only requires a call to qemu_notify_event().  Do it, removing
the need for the "cpu_request_exit" pseudo-irq.  The next patch
will remove it.
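
The idea can be illustrated in isolation.  The following is a
minimal sketch, not the patch itself: the mock_* functions are
hypothetical stand-ins for the bottom-half machinery, and a counter
replaces the real qemu_notify_event() so that the behavior can be
observed.

```c
#include <assert.h>
#include <stdbool.h>

static bool mock_bh_scheduled;   /* models dma_bh_scheduled */
static int mock_notify_count;    /* counts main-loop wakeups */

/* Stand-in for qemu_notify_event(): just record the wakeup. */
static void mock_notify_event(void)
{
    mock_notify_count++;
}

/* Models DMA_run re-arming the idle bottom half. */
static void mock_rearm(void)
{
    mock_bh_scheduled = true;
}

/* Models DMA_run_bh: the bottom half is running, so nothing is pending. */
static void mock_bh_run(void)
{
    mock_bh_scheduled = false;
}

/* Models DMA_schedule: wake the main loop only if work is pending. */
static void mock_dma_schedule(void)
{
    if (mock_bh_scheduled) {
        mock_notify_event();
    }
}
```

The flag keeps DMA_schedule cheap in the common case: when no idle
bottom half is pending, the call does nothing and the main loop is
not woken up needlessly.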

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/block/fdc.c       |  2 +-
 hw/dma/i8257.c       | 18 ++++++++++++------
 hw/sparc/sun4m.c     |  2 +-
 hw/sparc64/sun4u.c   |  2 +-
 include/hw/isa/isa.h |  2 +-
 5 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 5e1b67e..6686a72 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -1417,7 +1417,7 @@ static void fdctrl_start_transfer(FDCtrl *fdctrl, int direction)
                  * recall us...
                  */
                 DMA_hold_DREQ(fdctrl->dma_chann);
-                DMA_schedule(fdctrl->dma_chann);
+                DMA_schedule();
             } else {
                 /* Start transfer */
                 fdctrl_transfer_handler(fdctrl, fdctrl->dma_chann, 0,
diff --git a/hw/dma/i8257.c b/hw/dma/i8257.c
index a414029..409ba7d 100644
--- a/hw/dma/i8257.c
+++ b/hw/dma/i8257.c
@@ -358,6 +358,7 @@ static void channel_run (int ncont, int ichan)
 }
 
 static QEMUBH *dma_bh;
+static bool dma_bh_scheduled;
 
 static void DMA_run (void)
 {
@@ -390,12 +391,15 @@ static void DMA_run (void)
 
     running = 0;
 out:
-    if (rearm)
+    if (rearm) {
         qemu_bh_schedule_idle(dma_bh);
+        dma_bh_scheduled = true;
+    }
 }
 
 static void DMA_run_bh(void *unused)
 {
+    dma_bh_scheduled = false;
     DMA_run();
 }
 
@@ -458,12 +462,14 @@ int DMA_write_memory (int nchan, void *buf, int pos, int len)
     return len;
 }
 
-/* request the emulator to transfer a new DMA memory block ASAP */
-void DMA_schedule(int nchan)
+/* request the emulator to transfer a new DMA memory block ASAP (even
+ * if the idle bottom half would not have exited the iothread yet).
+ */
+void DMA_schedule(void)
 {
-    struct dma_cont *d = &dma_controllers[nchan > 3];
-
-    qemu_irq_pulse(*d->cpu_request_exit);
+    if (dma_bh_scheduled) {
+        qemu_notify_event();
+    }
 }
 
 static void dma_reset(void *opaque)
diff --git a/hw/sparc/sun4m.c b/hw/sparc/sun4m.c
index 68ac4d8..ebaae9d 100644
--- a/hw/sparc/sun4m.c
+++ b/hw/sparc/sun4m.c
@@ -109,7 +109,7 @@ int DMA_write_memory (int nchan, void *buf, int pos, int size)
 }
 void DMA_hold_DREQ (int nchan) {}
 void DMA_release_DREQ (int nchan) {}
-void DMA_schedule(int nchan) {}
+void DMA_schedule(void) {}
 
 void DMA_init(int high_page_enable, qemu_irq *cpu_request_exit)
 {
diff --git a/hw/sparc64/sun4u.c b/hw/sparc64/sun4u.c
index 30cfa0e..44eb4eb 100644
--- a/hw/sparc64/sun4u.c
+++ b/hw/sparc64/sun4u.c
@@ -112,7 +112,7 @@ int DMA_write_memory (int nchan, void *buf, int pos, int size)
 }
 void DMA_hold_DREQ (int nchan) {}
 void DMA_release_DREQ (int nchan) {}
-void DMA_schedule(int nchan) {}
+void DMA_schedule(void) {}
 
 void DMA_init(int high_page_enable, qemu_irq *cpu_request_exit)
 {
diff --git a/include/hw/isa/isa.h b/include/hw/isa/isa.h
index f21ceaa..81b94ea 100644
--- a/include/hw/isa/isa.h
+++ b/include/hw/isa/isa.h
@@ -112,7 +112,7 @@ int DMA_read_memory (int nchan, void *buf, int pos, int size);
 int DMA_write_memory (int nchan, void *buf, int pos, int size);
 void DMA_hold_DREQ (int nchan);
 void DMA_release_DREQ (int nchan);
-void DMA_schedule(int nchan);
+void DMA_schedule(void);
 void DMA_init(int high_page_enable, qemu_irq *cpu_request_exit);
 void DMA_register_channel (int nchan,
                            DMA_transfer_handler transfer_handler,
-- 
2.4.3


* [Qemu-devel] [PATCH 2/9] i8257: remove cpu_request_exit irq
  2015-08-26  0:17 [Qemu-devel] [PATCH v2 0/9] tcg: signal-free qemu_cpu_kick Paolo Bonzini
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 1/9] i8257: rewrite DMA_schedule to avoid hooking into the CPU loop Paolo Bonzini
@ 2015-08-26  0:17 ` Paolo Bonzini
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 3/9] tcg: introduce tcg_current_cpu Paolo Bonzini
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2015-08-26  0:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: mttcg, cota, rth

The cpu_request_exit irq is now unused.  cpu_exit is now almost
exclusively internal to the CPU execution loop.  A later patch in this
series will change the remaining occurrences to qemu_cpu_kick, making
it truly internal.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/dma/i82374.c         |  5 +----
 hw/dma/i8257.c          | 13 ++++---------
 hw/i386/pc.c            | 13 +------------
 hw/isa/i82378.c         |  3 +--
 hw/mips/mips_fulong2e.c | 13 +------------
 hw/mips/mips_jazz.c     | 13 +------------
 hw/mips/mips_malta.c    | 13 +------------
 hw/ppc/prep.c           | 11 -----------
 hw/sparc/sun4m.c        |  2 +-
 hw/sparc64/sun4u.c      |  2 +-
 include/hw/isa/isa.h    |  2 +-
 11 files changed, 13 insertions(+), 77 deletions(-)

diff --git a/hw/dma/i82374.c b/hw/dma/i82374.c
index b8ad2e6..f630971 100644
--- a/hw/dma/i82374.c
+++ b/hw/dma/i82374.c
@@ -38,7 +38,6 @@ do { fprintf(stderr, "i82374 ERROR: " fmt , ## __VA_ARGS__); } while (0)
 
 typedef struct I82374State {
     uint8_t commands[8];
-    qemu_irq out;
     PortioList port_list;
 } I82374State;
 
@@ -101,7 +100,7 @@ static uint32_t i82374_read_descriptor(void *opaque, uint32_t nport)
 
 static void i82374_realize(I82374State *s, Error **errp)
 {
-    DMA_init(1, &s->out);
+    DMA_init(1);
     memset(s->commands, 0, sizeof(s->commands));
 }
 
@@ -145,8 +144,6 @@ static void i82374_isa_realize(DeviceState *dev, Error **errp)
                     isa->iobase);
 
     i82374_realize(s, errp);
-
-    qdev_init_gpio_out(dev, &s->out, 1);
 }
 
 static Property i82374_properties[] = {
diff --git a/hw/dma/i8257.c b/hw/dma/i8257.c
index 409ba7d..1398424 100644
--- a/hw/dma/i8257.c
+++ b/hw/dma/i8257.c
@@ -59,7 +59,6 @@ static struct dma_cont {
     uint8_t flip_flop;
     int dshift;
     struct dma_regs regs[4];
-    qemu_irq *cpu_request_exit;
     MemoryRegion channel_io;
     MemoryRegion cont_io;
 } dma_controllers[2];
@@ -521,13 +520,11 @@ static const MemoryRegionOps cont_io_ops = {
 
 /* dshift = 0: 8 bit DMA, 1 = 16 bit DMA */
 static void dma_init2(struct dma_cont *d, int base, int dshift,
-                      int page_base, int pageh_base,
-                      qemu_irq *cpu_request_exit)
+                      int page_base, int pageh_base)
 {
     int i;
 
     d->dshift = dshift;
-    d->cpu_request_exit = cpu_request_exit;
 
     memory_region_init_io(&d->channel_io, NULL, &channel_io_ops, d,
                           "dma-chan", 8 << d->dshift);
@@ -591,12 +588,10 @@ static const VMStateDescription vmstate_dma = {
     }
 };
 
-void DMA_init(int high_page_enable, qemu_irq *cpu_request_exit)
+void DMA_init(int high_page_enable)
 {
-    dma_init2(&dma_controllers[0], 0x00, 0, 0x80,
-              high_page_enable ? 0x480 : -1, cpu_request_exit);
-    dma_init2(&dma_controllers[1], 0xc0, 1, 0x88,
-              high_page_enable ? 0x488 : -1, cpu_request_exit);
+    dma_init2(&dma_controllers[0], 0x00, 0, 0x80, high_page_enable ? 0x480 : -1);
+    dma_init2(&dma_controllers[1], 0xc0, 1, 0x88, high_page_enable ? 0x488 : -1);
     vmstate_register (NULL, 0, &vmstate_dma, &dma_controllers[0]);
     vmstate_register (NULL, 1, &vmstate_dma, &dma_controllers[1]);
 
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 9f2924e..16d06c6 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1446,15 +1446,6 @@ DeviceState *pc_vga_init(ISABus *isa_bus, PCIBus *pci_bus)
     return dev;
 }
 
-static void cpu_request_exit(void *opaque, int irq, int level)
-{
-    CPUState *cpu = current_cpu;
-
-    if (cpu && level) {
-        cpu_exit(cpu);
-    }
-}
-
 static const MemoryRegionOps ioport80_io_ops = {
     .write = ioport80_write,
     .read = ioport80_read,
@@ -1489,7 +1480,6 @@ void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,
     qemu_irq rtc_irq = NULL;
     qemu_irq *a20_line;
     ISADevice *i8042, *port92, *vmmouse, *pit = NULL;
-    qemu_irq *cpu_exit_irq;
     MemoryRegion *ioport80_io = g_new(MemoryRegion, 1);
     MemoryRegion *ioportF0_io = g_new(MemoryRegion, 1);
 
@@ -1566,8 +1556,7 @@ void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,
     port92 = isa_create_simple(isa_bus, "port92");
     port92_init(port92, &a20_line[1]);
 
-    cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
-    DMA_init(0, cpu_exit_irq);
+    DMA_init(0);
 
     for(i = 0; i < MAX_FD; i++) {
         fd[i] = drive_get(IF_FLOPPY, 0, i);
diff --git a/hw/isa/i82378.c b/hw/isa/i82378.c
index fcf97d8..d4c8306 100644
--- a/hw/isa/i82378.c
+++ b/hw/isa/i82378.c
@@ -100,7 +100,6 @@ static void i82378_realize(PCIDevice *pci, Error **errp)
 
     /* 2 82C37 (dma) */
     isa = isa_create_simple(isabus, "i82374");
-    qdev_connect_gpio_out(DEVICE(isa), 0, s->out[1]);
 
     /* timer */
     isa_create_simple(isabus, "mc146818rtc");
@@ -111,7 +110,7 @@ static void i82378_init(Object *obj)
     DeviceState *dev = DEVICE(obj);
     I82378State *s = I82378(obj);
 
-    qdev_init_gpio_out(dev, s->out, 2);
+    qdev_init_gpio_out(dev, s->out, 1);
     qdev_init_gpio_in(dev, i82378_request_pic_irq, 16);
 }
 
diff --git a/hw/mips/mips_fulong2e.c b/hw/mips/mips_fulong2e.c
index dea941a..6d2ea30 100644
--- a/hw/mips/mips_fulong2e.c
+++ b/hw/mips/mips_fulong2e.c
@@ -251,15 +251,6 @@ static void network_init (PCIBus *pci_bus)
     }
 }
 
-static void cpu_request_exit(void *opaque, int irq, int level)
-{
-    CPUState *cpu = current_cpu;
-
-    if (cpu && level) {
-        cpu_exit(cpu);
-    }
-}
-
 static void mips_fulong2e_init(MachineState *machine)
 {
     ram_addr_t ram_size = machine->ram_size;
@@ -274,7 +265,6 @@ static void mips_fulong2e_init(MachineState *machine)
     long bios_size;
     int64_t kernel_entry;
     qemu_irq *i8259;
-    qemu_irq *cpu_exit_irq;
     PCIBus *pci_bus;
     ISABus *isa_bus;
     I2CBus *smbus;
@@ -375,8 +365,7 @@ static void mips_fulong2e_init(MachineState *machine)
 
     /* init other devices */
     pit = pit_init(isa_bus, 0x40, 0, NULL);
-    cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
-    DMA_init(0, cpu_exit_irq);
+    DMA_init(0);
 
     /* Super I/O */
     isa_create_simple(isa_bus, "i8042");
diff --git a/hw/mips/mips_jazz.c b/hw/mips/mips_jazz.c
index 9d60633..3906016 100644
--- a/hw/mips/mips_jazz.c
+++ b/hw/mips/mips_jazz.c
@@ -104,15 +104,6 @@ static const MemoryRegionOps dma_dummy_ops = {
 #define MAGNUM_BIOS_SIZE_MAX 0x7e000
 #define MAGNUM_BIOS_SIZE (BIOS_SIZE < MAGNUM_BIOS_SIZE_MAX ? BIOS_SIZE : MAGNUM_BIOS_SIZE_MAX)
 
-static void cpu_request_exit(void *opaque, int irq, int level)
-{
-    CPUState *cpu = current_cpu;
-
-    if (cpu && level) {
-        cpu_exit(cpu);
-    }
-}
-
 static CPUUnassignedAccess real_do_unassigned_access;
 static void mips_jazz_do_unassigned_access(CPUState *cpu, hwaddr addr,
                                            bool is_write, bool is_exec,
@@ -150,7 +141,6 @@ static void mips_jazz_init(MachineState *machine,
     ISADevice *pit;
     DriveInfo *fds[MAX_FD];
     qemu_irq esp_reset, dma_enable;
-    qemu_irq *cpu_exit_irq;
     MemoryRegion *ram = g_new(MemoryRegion, 1);
     MemoryRegion *bios = g_new(MemoryRegion, 1);
     MemoryRegion *bios2 = g_new(MemoryRegion, 1);
@@ -234,8 +224,7 @@ static void mips_jazz_init(MachineState *machine,
     /* ISA devices */
     i8259 = i8259_init(isa_bus, env->irq[4]);
     isa_bus_irqs(isa_bus, i8259);
-    cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
-    DMA_init(0, cpu_exit_irq);
+    DMA_init(0);
     pit = pit_init(isa_bus, 0x40, 0, NULL);
     pcspk_init(isa_bus, pit);
 
diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c
index 3082e75..23b6fc3 100644
--- a/hw/mips/mips_malta.c
+++ b/hw/mips/mips_malta.c
@@ -905,15 +905,6 @@ static void main_cpu_reset(void *opaque)
     }
 }
 
-static void cpu_request_exit(void *opaque, int irq, int level)
-{
-    CPUState *cpu = current_cpu;
-
-    if (cpu && level) {
-        cpu_exit(cpu);
-    }
-}
-
 static
 void mips_malta_init(MachineState *machine)
 {
@@ -939,7 +930,6 @@ void mips_malta_init(MachineState *machine)
     MIPSCPU *cpu;
     CPUMIPSState *env;
     qemu_irq *isa_irq;
-    qemu_irq *cpu_exit_irq;
     int piix4_devfn;
     I2CBus *smbus;
     int i;
@@ -1175,8 +1165,7 @@ void mips_malta_init(MachineState *machine)
     smbus_eeprom_init(smbus, 8, smbus_eeprom_buf, smbus_eeprom_size);
     g_free(smbus_eeprom_buf);
     pit = pit_init(isa_bus, 0x40, 0, NULL);
-    cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
-    DMA_init(0, cpu_exit_irq);
+    DMA_init(0);
 
     /* Super I/O */
     isa_create_simple(isa_bus, "i8042");
diff --git a/hw/ppc/prep.c b/hw/ppc/prep.c
index 45b5f62..81f0838 100644
--- a/hw/ppc/prep.c
+++ b/hw/ppc/prep.c
@@ -336,15 +336,6 @@ static uint32_t PREP_io_800_readb (void *opaque, uint32_t addr)
 
 #define NVRAM_SIZE        0x2000
 
-static void cpu_request_exit(void *opaque, int irq, int level)
-{
-    CPUState *cpu = current_cpu;
-
-    if (cpu && level) {
-        cpu_exit(cpu);
-    }
-}
-
 static void ppc_prep_reset(void *opaque)
 {
     PowerPCCPU *cpu = opaque;
@@ -626,8 +617,6 @@ static void ppc_prep_init(MachineState *machine)
     cpu = POWERPC_CPU(first_cpu);
     qdev_connect_gpio_out(&pci->qdev, 0,
                           cpu->env.irq_inputs[PPC6xx_INPUT_INT]);
-    qdev_connect_gpio_out(&pci->qdev, 1,
-                          qemu_allocate_irq(cpu_request_exit, NULL, 0));
     sysbus_connect_irq(&pcihost->busdev, 0, qdev_get_gpio_in(&pci->qdev, 9));
     sysbus_connect_irq(&pcihost->busdev, 1, qdev_get_gpio_in(&pci->qdev, 11));
     sysbus_connect_irq(&pcihost->busdev, 2, qdev_get_gpio_in(&pci->qdev, 9));
diff --git a/hw/sparc/sun4m.c b/hw/sparc/sun4m.c
index ebaae9d..b5db8b7 100644
--- a/hw/sparc/sun4m.c
+++ b/hw/sparc/sun4m.c
@@ -111,7 +111,7 @@ void DMA_hold_DREQ (int nchan) {}
 void DMA_release_DREQ (int nchan) {}
 void DMA_schedule(void) {}
 
-void DMA_init(int high_page_enable, qemu_irq *cpu_request_exit)
+void DMA_init(int high_page_enable)
 {
 }
 
diff --git a/hw/sparc64/sun4u.c b/hw/sparc64/sun4u.c
index 44eb4eb..a887a86 100644
--- a/hw/sparc64/sun4u.c
+++ b/hw/sparc64/sun4u.c
@@ -114,7 +114,7 @@ void DMA_hold_DREQ (int nchan) {}
 void DMA_release_DREQ (int nchan) {}
 void DMA_schedule(void) {}
 
-void DMA_init(int high_page_enable, qemu_irq *cpu_request_exit)
+void DMA_init(int high_page_enable)
 {
 }
 
diff --git a/include/hw/isa/isa.h b/include/hw/isa/isa.h
index 81b94ea..d758b39 100644
--- a/include/hw/isa/isa.h
+++ b/include/hw/isa/isa.h
@@ -113,7 +113,7 @@ int DMA_write_memory (int nchan, void *buf, int pos, int size);
 void DMA_hold_DREQ (int nchan);
 void DMA_release_DREQ (int nchan);
 void DMA_schedule(void);
-void DMA_init(int high_page_enable, qemu_irq *cpu_request_exit);
+void DMA_init(int high_page_enable);
 void DMA_register_channel (int nchan,
                            DMA_transfer_handler transfer_handler,
                            void *opaque);
-- 
2.4.3


* [Qemu-devel] [PATCH 3/9] tcg: introduce tcg_current_cpu
  2015-08-26  0:17 [Qemu-devel] [PATCH v2 0/9] tcg: signal-free qemu_cpu_kick Paolo Bonzini
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 1/9] i8257: rewrite DMA_schedule to avoid hooking into the CPU loop Paolo Bonzini
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 2/9] i8257: remove cpu_request_exit irq Paolo Bonzini
@ 2015-08-26  0:17 ` Paolo Bonzini
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 4/9] remove qemu/tls.h Paolo Bonzini
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2015-08-26  0:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: mttcg, cota, rth

This is already useful on Windows in order to remove tls.h, because
accesses to current_cpu are done from a different thread on that
platform.  It will be used on POSIX platforms as soon as TCG stops
using signals to interrupt the execution of translated code.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpu-exec.c              | 14 +++++---------
 cpus.c                  |  5 +++--
 include/exec/exec-all.h |  1 +
 3 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 58144b4..2c3cb7d 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -382,6 +382,7 @@ static void cpu_handle_debug_exception(CPUState *cpu)
 /* main execution loop */
 
 volatile sig_atomic_t exit_request;
+CPUState *tcg_current_cpu;
 
 int cpu_exec(CPUState *cpu)
 {
@@ -405,15 +406,7 @@ int cpu_exec(CPUState *cpu)
     }
 
     current_cpu = cpu;
-
-    /* As long as current_cpu is null, up to the assignment just above,
-     * requests by other threads to exit the execution loop are expected to
-     * be issued using the exit_request global. We must make sure that our
-     * evaluation of the global value is performed past the current_cpu
-     * value transition point, which requires a memory barrier as well as
-     * an instruction scheduling constraint on modern architectures.  */
-    smp_mb();
-
+    atomic_mb_set(&tcg_current_cpu, cpu);
     rcu_read_lock();
 
     if (unlikely(exit_request)) {
@@ -614,5 +607,8 @@ int cpu_exec(CPUState *cpu)
 
     /* fail safe : never use current_cpu outside cpu_exec() */
     current_cpu = NULL;
+
+    /* Does not need atomic_mb_set because a spurious wakeup is okay.  */
+    atomic_set(&tcg_current_cpu, NULL);
     return ret;
 }
diff --git a/cpus.c b/cpus.c
index 8884278..ec8168c 100644
--- a/cpus.c
+++ b/cpus.c
@@ -663,8 +663,9 @@ static void cpu_handle_guest_debug(CPUState *cpu)
 
 static void cpu_signal(int sig)
 {
-    if (current_cpu) {
-        cpu_exit(current_cpu);
+    CPUState *cpu = atomic_mb_read(&tcg_current_cpu);
+    if (cpu) {
+        cpu_exit(cpu);
     }
     exit_request = 1;
 }
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index b3f900a..c92d434 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -346,6 +346,7 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env1, target_ulong addr);
 extern int singlestep;
 
 /* cpu-exec.c */
+extern CPUState *tcg_current_cpu;
 extern volatile sig_atomic_t exit_request;
 
 #endif
-- 
2.4.3


* [Qemu-devel] [PATCH 4/9] remove qemu/tls.h
  2015-08-26  0:17 [Qemu-devel] [PATCH v2 0/9] tcg: signal-free qemu_cpu_kick Paolo Bonzini
                   ` (2 preceding siblings ...)
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 3/9] tcg: introduce tcg_current_cpu Paolo Bonzini
@ 2015-08-26  0:17 ` Paolo Bonzini
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 5/9] tcg: assign cpu->current_tb in a simpler place Paolo Bonzini
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2015-08-26  0:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: mttcg, cota, rth

TLS is now required on all platforms, so DECLARE_TLS and DEFINE_TLS are
not needed anymore.  Removing them does not break Windows because of the
previous patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c             |  2 +-
 include/qemu/tls.h | 52 ----------------------------------------------------
 include/qom/cpu.h  |  4 +---
 3 files changed, 2 insertions(+), 56 deletions(-)
 delete mode 100644 include/qemu/tls.h

diff --git a/exec.c b/exec.c
index 54cd70a..0f2feb9 100644
--- a/exec.c
+++ b/exec.c
@@ -90,7 +90,7 @@ static MemoryRegion io_mem_unassigned;
 struct CPUTailQ cpus = QTAILQ_HEAD_INITIALIZER(cpus);
 /* current CPU in the current thread. It is only valid inside
    cpu_exec() */
-DEFINE_TLS(CPUState *, current_cpu);
+__thread CPUState *current_cpu;
 /* 0 = Do not count executed instructions.
    1 = Precise instruction counting.
    2 = Adaptive rate instruction counting.  */
diff --git a/include/qemu/tls.h b/include/qemu/tls.h
deleted file mode 100644
index b92ea9d..0000000
--- a/include/qemu/tls.h
+++ /dev/null
@@ -1,52 +0,0 @@
-/*
- * Abstraction layer for defining and using TLS variables
- *
- * Copyright (c) 2011 Red Hat, Inc
- * Copyright (c) 2011 Linaro Limited
- *
- * Authors:
- *  Paolo Bonzini <pbonzini@redhat.com>
- *  Peter Maydell <peter.maydell@linaro.org>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License as
- * published by the Free Software Foundation; either version 2 of
- * the License, or (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License along
- * with this program; if not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef QEMU_TLS_H
-#define QEMU_TLS_H
-
-/* Per-thread variables. Note that we only have implementations
- * which are really thread-local on Linux; the dummy implementations
- * define plain global variables.
- *
- * This means that for the moment use should be restricted to
- * per-VCPU variables, which are OK because:
- *  - the only -user mode supporting multiple VCPU threads is linux-user
- *  - TCG system mode is single-threaded regarding VCPUs
- *  - KVM system mode is multi-threaded but limited to Linux
- *
- * TODO: proper implementations via Win32 .tls sections and
- * POSIX pthread_getspecific.
- */
-#ifdef __linux__
-#define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
-#define DEFINE_TLS(type, x)  __thread __typeof__(type) tls__##x
-#define tls_var(x)           tls__##x
-#else
-/* Dummy implementations which define plain global variables */
-#define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
-#define DEFINE_TLS(type, x)  __typeof__(type) tls__##x
-#define tls_var(x)           tls__##x
-#endif
-
-#endif
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 77bbff2..2f9be7d 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -28,7 +28,6 @@
 #include "exec/memattrs.h"
 #include "qemu/queue.h"
 #include "qemu/thread.h"
-#include "qemu/tls.h"
 #include "qemu/typedefs.h"
 
 typedef int (*WriteCoreDumpFunction)(const void *buf, size_t size,
@@ -337,8 +336,7 @@ extern struct CPUTailQ cpus;
     QTAILQ_FOREACH_REVERSE(cpu, &cpus, CPUTailQ, node)
 #define first_cpu QTAILQ_FIRST(&cpus)
 
-DECLARE_TLS(CPUState *, current_cpu);
-#define current_cpu tls_var(current_cpu)
+extern __thread CPUState *current_cpu;
 
 /**
  * cpu_paging_enabled:
-- 
2.4.3


* [Qemu-devel] [PATCH 5/9] tcg: assign cpu->current_tb in a simpler place
  2015-08-26  0:17 [Qemu-devel] [PATCH v2 0/9] tcg: signal-free qemu_cpu_kick Paolo Bonzini
                   ` (3 preceding siblings ...)
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 4/9] remove qemu/tls.h Paolo Bonzini
@ 2015-08-26  0:17 ` Paolo Bonzini
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 6/9] tcg: synchronize cpu->exit_request and cpu->tcg_exit_req accesses Paolo Bonzini
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2015-08-26  0:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: mttcg, cota, rth

TCG has not been reading cpu->current_tb from signal handlers for years.
The code that synchronized cpu_exec with the signal handler is not
needed anymore.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpu-exec.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 2c3cb7d..7fcc46f 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -528,17 +528,13 @@ int cpu_exec(CPUState *cpu)
                                 next_tb & TB_EXIT_MASK, tb);
                 }
                 tb_unlock();
-                /* cpu_interrupt might be called while translating the
-                   TB, but before it is linked into a potentially
-                   infinite loop and becomes env->current_tb. Avoid
-                   starting execution if there is a pending interrupt. */
-                cpu->current_tb = tb;
-                barrier();
                 if (likely(!cpu->exit_request)) {
                     trace_exec_tb(tb, tb->pc);
                     tc_ptr = tb->tc_ptr;
                     /* execute the generated code */
+                    cpu->current_tb = tb;
                     next_tb = cpu_tb_exec(cpu, tc_ptr);
+                    cpu->current_tb = NULL;
                     switch (next_tb & TB_EXIT_MASK) {
                     case TB_EXIT_REQUESTED:
                         /* Something asked us to stop executing
@@ -581,7 +577,6 @@ int cpu_exec(CPUState *cpu)
                         break;
                     }
                 }
-                cpu->current_tb = NULL;
                 /* Try to align the host and virtual clocks
                    if the guest is in advance */
                 align_clocks(&sc, cpu);
-- 
2.4.3


* [Qemu-devel] [PATCH 6/9] tcg: synchronize cpu->exit_request and cpu->tcg_exit_req accesses
  2015-08-26  0:17 [Qemu-devel] [PATCH v2 0/9] tcg: signal-free qemu_cpu_kick Paolo Bonzini
                   ` (4 preceding siblings ...)
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 5/9] tcg: assign cpu->current_tb in a simpler place Paolo Bonzini
@ 2015-08-26  0:17 ` Paolo Bonzini
  2015-08-28  2:19   ` Emilio G. Cota
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 7/9] tcg: synchronize exit_request and tcg_current_cpu accesses Paolo Bonzini
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2015-08-26  0:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: mttcg, cota, rth

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpu-exec.c | 6 +++++-
 qom/cpu.c  | 2 ++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 7fcc46f..2128bf1 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -542,8 +542,12 @@ int cpu_exec(CPUState *cpu)
                          * loop. Whatever requested the exit will also
                          * have set something else (eg exit_request or
                          * interrupt_request) which we will handle
-                         * next time around the loop.
+                         * next time around the loop.  But we need to
+                         * ensure the tcg_exit_req read in generated code
+                         * comes before the next read of cpu->exit_request
+                         * or cpu->interrupt_request.
                          */
+                        smp_rmb();
                         next_tb = 0;
                         break;
                     case TB_EXIT_ICOUNT_EXPIRED:
diff --git a/qom/cpu.c b/qom/cpu.c
index 3e93223..3841f0d 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -114,6 +114,8 @@ void cpu_reset_interrupt(CPUState *cpu, int mask)
 void cpu_exit(CPUState *cpu)
 {
     cpu->exit_request = 1;
+    /* Ensure cpu_exec will see the exit request after TCG has exited.  */
+    smp_wmb();
     cpu->tcg_exit_req = 1;
 }
 
-- 
2.4.3


* [Qemu-devel] [PATCH 7/9] tcg: synchronize exit_request and tcg_current_cpu accesses
  2015-08-26  0:17 [Qemu-devel] [PATCH v2 0/9] tcg: signal-free qemu_cpu_kick Paolo Bonzini
                   ` (5 preceding siblings ...)
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 6/9] tcg: synchronize cpu->exit_request and cpu->tcg_exit_req accesses Paolo Bonzini
@ 2015-08-26  0:17 ` Paolo Bonzini
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 8/9] use qemu_cpu_kick instead of cpu_exit or qemu_cpu_kick_thread Paolo Bonzini
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2015-08-26  0:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: mttcg, cota, rth

Synchronize the remaining pair of accesses in cpu_signal.  The
wrongly-ordered accesses in cpu_signal are currently not an issue on
Windows because they execute atomically between SuspendThread and
ResumeThread.  Only cpu_exec can be split (the newly introduced
atomic_mb_read would be needed on Windows too; presumably it works
today only because the compiler is not doing strange optimizations).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpu-exec.c |  2 +-
 cpus.c     | 14 ++++++++++----
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 2128bf1..b337506 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -409,7 +409,7 @@ int cpu_exec(CPUState *cpu)
     atomic_mb_set(&tcg_current_cpu, cpu);
     rcu_read_lock();
 
-    if (unlikely(exit_request)) {
+    if (unlikely(atomic_mb_read(&exit_request))) {
         cpu->exit_request = 1;
     }
 
diff --git a/cpus.c b/cpus.c
index ec8168c..783ef00 100644
--- a/cpus.c
+++ b/cpus.c
@@ -663,11 +663,15 @@ static void cpu_handle_guest_debug(CPUState *cpu)
 
 static void cpu_signal(int sig)
 {
-    CPUState *cpu = atomic_mb_read(&tcg_current_cpu);
+    CPUState *cpu;
+    /* Ensure whatever caused the exit has reached the CPU threads before
+     * writing exit_request.
+     */
+    atomic_mb_set(&exit_request, 1);
+    cpu = atomic_mb_read(&tcg_current_cpu);
     if (cpu) {
         cpu_exit(cpu);
     }
-    exit_request = 1;
 }
 
 #ifdef CONFIG_LINUX
@@ -1074,7 +1078,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
     }
 
     /* process any pending work */
-    exit_request = 1;
+    atomic_mb_set(&exit_request, 1);
 
     while (1) {
         tcg_exec_all();
@@ -1453,7 +1457,9 @@ static void tcg_exec_all(void)
             break;
         }
     }
-    exit_request = 0;
+
+    /* Pairs with smp_wmb in qemu_cpu_kick.  */
+    atomic_mb_set(&exit_request, 0);
 }
 
 void list_cpus(FILE *f, fprintf_function cpu_fprintf, const char *optarg)
-- 
2.4.3
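
A minimal sketch of the store-then-load handshake this patch sets up, assuming
C11 seq_cst atomics in place of atomic_mb_set/atomic_mb_read. FakeCPU,
global_exit_request, current_cpu_ptr, kicker_fn, cpu_fn and kick_is_never_lost
are hypothetical names for illustration only. The point is that each side
writes its own variable before reading the other's, so at least one side
always notices the other.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced CPU state. */
typedef struct FakeCPU {
    atomic_bool exit_request;
} FakeCPU;

static atomic_bool global_exit_request;      /* stands in for exit_request */
static _Atomic(FakeCPU *) current_cpu_ptr;   /* stands in for tcg_current_cpu */
static FakeCPU the_cpu;

/* Kicker, modeled on cpu_signal() after the patch: write exit_request
 * first, then read tcg_current_cpu. */
static void *kicker_fn(void *arg)
{
    (void)arg;
    atomic_store(&global_exit_request, true);        /* seq_cst store (C) */
    FakeCPU *cpu = atomic_load(&current_cpu_ptr);    /* seq_cst load  (B) */
    if (cpu) {
        atomic_store(&cpu->exit_request, true);
    }
    return NULL;
}

/* CPU thread, modeled on the top of cpu_exec(): publish the current CPU
 * first, then check the global exit_request. */
static void *cpu_fn(void *arg)
{
    (void)arg;
    atomic_store(&current_cpu_ptr, &the_cpu);        /* seq_cst store (B) */
    if (atomic_load(&global_exit_request)) {         /* seq_cst load  (C) */
        atomic_store(&the_cpu.exit_request, true);
    }
    return NULL;
}

/* Races the two sides 'rounds' times; returns true if the per-CPU
 * exit_request ended up set in every round. */
static bool kick_is_never_lost(int rounds)
{
    for (int i = 0; i < rounds; i++) {
        pthread_t kicker, cpu;

        atomic_store(&global_exit_request, false);
        atomic_store(&current_cpu_ptr, (FakeCPU *)NULL);
        atomic_store(&the_cpu.exit_request, false);

        if (pthread_create(&kicker, NULL, kicker_fn, NULL) != 0) {
            return false;
        }
        if (pthread_create(&cpu, NULL, cpu_fn, NULL) != 0) {
            pthread_join(kicker, NULL);
            return false;
        }
        pthread_join(kicker, NULL);
        pthread_join(cpu, NULL);

        if (!atomic_load(&the_cpu.exit_request)) {
            return false;
        }
    }
    return true;
}
```

With the old order (read tcg_current_cpu, then write exit_request) the kicker
could read NULL, the CPU thread could then read exit_request as 0, and the
kick would be lost until the next one.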


* [Qemu-devel] [PATCH 8/9] use qemu_cpu_kick instead of cpu_exit or qemu_cpu_kick_thread
  2015-08-26  0:17 [Qemu-devel] [PATCH v2 0/9] tcg: signal-free qemu_cpu_kick Paolo Bonzini
                   ` (6 preceding siblings ...)
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 7/9] tcg: synchronize exit_request and tcg_current_cpu accesses Paolo Bonzini
@ 2015-08-26  0:17 ` Paolo Bonzini
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 9/9] tcg: signal-free qemu_cpu_kick Paolo Bonzini
  2015-08-28  4:21 ` [Qemu-devel] [PATCH v2 0/9] " Richard Henderson
  9 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2015-08-26  0:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: mttcg, cota, rth

Use the same API to trigger interruption of a CPU, whether under
TCG or KVM.  There is no functional difference: these calls come from
the CPU thread, so qemu_cpu_kick will send a signal to the running
thread and it will be processed synchronously, just like a call to
cpu_exit.  The only difference is the overhead, but neither call to
cpu_exit (now qemu_cpu_kick) is in a hot path.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpus.c              | 17 ++++++++---------
 gdbstub.c           |  2 +-
 hw/ppc/spapr_rtas.c |  2 +-
 3 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/cpus.c b/cpus.c
index 783ef00..8243b2c 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1101,6 +1101,12 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
 #ifndef _WIN32
     int err;
 
+    if (!tcg_enabled()) {
+        if (cpu->thread_kicked) {
+            return;
+        }
+        cpu->thread_kicked = true;
+    }
     err = pthread_kill(cpu->thread->thread, SIG_IPI);
     if (err) {
         fprintf(stderr, "qemu:%s: %s", __func__, strerror(err));
@@ -1138,21 +1144,14 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
 void qemu_cpu_kick(CPUState *cpu)
 {
     qemu_cond_broadcast(cpu->halt_cond);
-    if (!tcg_enabled() && !cpu->thread_kicked) {
-        qemu_cpu_kick_thread(cpu);
-        cpu->thread_kicked = true;
-    }
+    qemu_cpu_kick_thread(cpu);
 }
 
 void qemu_cpu_kick_self(void)
 {
 #ifndef _WIN32
     assert(current_cpu);
-
-    if (!current_cpu->thread_kicked) {
-        qemu_cpu_kick_thread(current_cpu);
-        current_cpu->thread_kicked = true;
-    }
+    qemu_cpu_kick_thread(current_cpu);
 #else
     abort();
 #endif
diff --git a/gdbstub.c b/gdbstub.c
index ffe7e6e..a5a173a 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -1362,7 +1362,7 @@ void gdb_do_syscall(gdb_syscall_complete_cb cb, const char *fmt, ...)
        is still in the running state, which can cause packets to be dropped
        and state transition 'T' packets to be sent while the syscall is still
        being processed.  */
-    cpu_exit(s->c_cpu);
+    qemu_cpu_kick(s->c_cpu);
 #endif
 }
 
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 2986f94..9869bc9 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -214,7 +214,7 @@ static void rtas_stop_self(PowerPCCPU *cpu, sPAPRMachineState *spapr,
     CPUPPCState *env = &cpu->env;
 
     cs->halted = 1;
-    cpu_exit(cs);
+    qemu_cpu_kick(cs);
     /*
      * While stopping a CPU, the guest calls H_CPPR which
      * effectively disables interrupts on XICS level.
-- 
2.4.3
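
A minimal sketch of the "processed synchronously" claim above, assuming a
POSIX host: a signal that a thread sends to itself with pthread_kill, and
that is not blocked, is expected to have run its handler by the time
pthread_kill returns (POSIX states this for raise(), which it defines as
equivalent to pthread_kill(pthread_self(), sig)). SIGUSR1 stands in for
SIG_IPI here, and ipi_handler and self_kick_is_synchronous are hypothetical
names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

static volatile sig_atomic_t handler_ran;

/* Stand-in for the handler that would call cpu_exit(). */
static void ipi_handler(int sig)
{
    (void)sig;
    handler_ran = 1;
}

/* Returns true if the handler had already run when pthread_kill()
 * returned to the calling thread. */
static bool self_kick_is_synchronous(void)
{
    struct sigaction sa;
    sigset_t set;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = ipi_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) {
        return false;
    }

    /* Make sure the signal is deliverable in this thread. */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (pthread_sigmask(SIG_UNBLOCK, &set, NULL) != 0) {
        return false;
    }

    handler_ran = 0;
    if (pthread_kill(pthread_self(), SIGUSR1) != 0) {
        return false;
    }
    return handler_ran == 1;
}
```

This only covers the self-kick case the commit message describes; a kick
sent from another thread is asynchronous with respect to the target.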


* [Qemu-devel] [PATCH 9/9] tcg: signal-free qemu_cpu_kick
  2015-08-26  0:17 [Qemu-devel] [PATCH v2 0/9] tcg: signal-free qemu_cpu_kick Paolo Bonzini
                   ` (7 preceding siblings ...)
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 8/9] use qemu_cpu_kick instead of cpu_exit or qemu_cpu_kick_thread Paolo Bonzini
@ 2015-08-26  0:17 ` Paolo Bonzini
  2015-08-28  4:21 ` [Qemu-devel] [PATCH v2 0/9] " Richard Henderson
  9 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2015-08-26  0:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: mttcg, cota, rth

Signals are slow and do not exist on Win32.  The previous patches
have done most of the legwork to introduce memory barriers (some
of them were even there already for the sake of Windows!) and
we can now set the flags directly in the iothread.

qemu_cpu_kick_thread is not used anymore on TCG, since the TCG thread is
never outside usermode while the CPU is running (not halted).  Instead run
the content of the signal handler (now in qemu_cpu_kick_no_halt) directly.
qemu_cpu_kick_no_halt is also used in qemu_mutex_lock_iothread to avoid
the overhead of qemu_cond_broadcast.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpu-exec.c              |  2 +-
 cpus.c                  | 89 ++++++++++++-------------------------------------
 include/exec/exec-all.h |  4 +--
 include/qom/cpu.h       |  4 +--
 4 files changed, 27 insertions(+), 72 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index b337506..41c560e 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -381,7 +381,7 @@ static void cpu_handle_debug_exception(CPUState *cpu)
 
 /* main execution loop */
 
-volatile sig_atomic_t exit_request;
+bool exit_request;
 CPUState *tcg_current_cpu;
 
 int cpu_exec(CPUState *cpu)
diff --git a/cpus.c b/cpus.c
index 8243b2c..105b914 100644
--- a/cpus.c
+++ b/cpus.c
@@ -661,19 +661,6 @@ static void cpu_handle_guest_debug(CPUState *cpu)
     cpu->stopped = true;
 }
 
-static void cpu_signal(int sig)
-{
-    CPUState *cpu;
-    /* Ensure whatever caused the exit has reached the CPU threads before
-     * writing exit_request.
-     */
-    atomic_mb_set(&exit_request, 1);
-    cpu = atomic_mb_read(&tcg_current_cpu);
-    if (cpu) {
-        cpu_exit(cpu);
-    }
-}
-
 #ifdef CONFIG_LINUX
 static void sigbus_reraise(void)
 {
@@ -786,29 +773,11 @@ static void qemu_kvm_init_cpu_signals(CPUState *cpu)
     }
 }
 
-static void qemu_tcg_init_cpu_signals(void)
-{
-    sigset_t set;
-    struct sigaction sigact;
-
-    memset(&sigact, 0, sizeof(sigact));
-    sigact.sa_handler = cpu_signal;
-    sigaction(SIG_IPI, &sigact, NULL);
-
-    sigemptyset(&set);
-    sigaddset(&set, SIG_IPI);
-    pthread_sigmask(SIG_UNBLOCK, &set, NULL);
-}
-
 #else /* _WIN32 */
 static void qemu_kvm_init_cpu_signals(CPUState *cpu)
 {
     abort();
 }
-
-static void qemu_tcg_init_cpu_signals(void)
-{
-}
 #endif /* _WIN32 */
 
 static QemuMutex qemu_global_mutex;
@@ -1057,7 +1026,6 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
     rcu_register_thread();
 
     qemu_mutex_lock_iothread();
-    qemu_tcg_init_cpu_signals();
     qemu_thread_get_self(cpu->thread);
 
     CPU_FOREACH(cpu) {
@@ -1101,60 +1069,47 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
 #ifndef _WIN32
     int err;
 
-    if (!tcg_enabled()) {
-        if (cpu->thread_kicked) {
-            return;
-        }
-        cpu->thread_kicked = true;
+    if (cpu->thread_kicked) {
+        return;
     }
+    cpu->thread_kicked = true;
     err = pthread_kill(cpu->thread->thread, SIG_IPI);
     if (err) {
         fprintf(stderr, "qemu:%s: %s", __func__, strerror(err));
         exit(1);
     }
 #else /* _WIN32 */
-    if (!qemu_cpu_is_self(cpu)) {
-        CONTEXT tcgContext;
-
-        if (SuspendThread(cpu->hThread) == (DWORD)-1) {
-            fprintf(stderr, "qemu:%s: GetLastError:%lu\n", __func__,
-                    GetLastError());
-            exit(1);
-        }
-
-        /* On multi-core systems, we are not sure that the thread is actually
-         * suspended until we can get the context.
-         */
-        tcgContext.ContextFlags = CONTEXT_CONTROL;
-        while (GetThreadContext(cpu->hThread, &tcgContext) != 0) {
-            continue;
-        }
-
-        cpu_signal(0);
+    abort();
+#endif
+}
 
-        if (ResumeThread(cpu->hThread) == (DWORD)-1) {
-            fprintf(stderr, "qemu:%s: GetLastError:%lu\n", __func__,
-                    GetLastError());
-            exit(1);
-        }
+static void qemu_cpu_kick_no_halt(void)
+{
+    CPUState *cpu;
+    /* Ensure whatever caused the exit has reached the CPU threads before
+     * writing exit_request.
+     */
+    atomic_mb_set(&exit_request, 1);
+    cpu = atomic_mb_read(&tcg_current_cpu);
+    if (cpu) {
+        cpu_exit(cpu);
     }
-#endif
 }
 
 void qemu_cpu_kick(CPUState *cpu)
 {
     qemu_cond_broadcast(cpu->halt_cond);
-    qemu_cpu_kick_thread(cpu);
+    if (tcg_enabled()) {
+        qemu_cpu_kick_no_halt();
+    } else {
+        qemu_cpu_kick_thread(cpu);
+    }
 }
 
 void qemu_cpu_kick_self(void)
 {
-#ifndef _WIN32
     assert(current_cpu);
     qemu_cpu_kick_thread(current_cpu);
-#else
-    abort();
-#endif
 }
 
 bool qemu_cpu_is_self(CPUState *cpu)
@@ -1186,7 +1141,7 @@ void qemu_mutex_lock_iothread(void)
         atomic_dec(&iothread_requesting_mutex);
     } else {
         if (qemu_mutex_trylock(&qemu_global_mutex)) {
-            qemu_cpu_kick_thread(first_cpu);
+            qemu_cpu_kick_no_halt();
             qemu_mutex_lock(&qemu_global_mutex);
         }
         atomic_dec(&iothread_requesting_mutex);
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index c92d434..228601f 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -345,8 +345,8 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env1, target_ulong addr);
 /* vl.c */
 extern int singlestep;
 
-/* cpu-exec.c */
+/* cpu-exec.c, accessed with atomic_mb_read/atomic_mb_set */
 extern CPUState *tcg_current_cpu;
-extern volatile sig_atomic_t exit_request;
+extern bool exit_request;
 
 #endif
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 2f9be7d..c3d610b 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -269,7 +269,7 @@ struct CPUState {
     bool created;
     bool stop;
     bool stopped;
-    volatile sig_atomic_t exit_request;
+    bool exit_request;
     uint32_t interrupt_request;
     int singlestep_enabled;
     int64_t icount_extra;
@@ -323,7 +323,7 @@ struct CPUState {
        offset from AREG0.  Leave this field at the end so as to make the
        (absolute value) offset as small as possible.  This reduces code
        size, especially for hosts without large memory offsets.  */
-    volatile sig_atomic_t tcg_exit_req;
+    uint32_t tcg_exit_req;
 };
 
 QTAILQ_HEAD(CPUTailQ, CPUState);
-- 
2.4.3
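
A minimal sketch of why the flags no longer need to be volatile sig_atomic_t
once every access goes through a sequentially consistent accessor. It assumes
the GCC/Clang __atomic builtins; sketch_mb_read, sketch_mb_set,
sketch_exit_request and sketch_request_exit are hypothetical names, and
QEMU's own atomic_mb_read/atomic_mb_set may be implemented differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical seq_cst accessors built on compiler builtins. */
#define sketch_mb_read(ptr) \
    __atomic_load_n((ptr), __ATOMIC_SEQ_CST)
#define sketch_mb_set(ptr, val) \
    __atomic_store_n((ptr), (val), __ATOMIC_SEQ_CST)

/* Plain bool: atomicity and ordering come from the accessors, not from
 * a volatile qualifier on the declaration. */
static bool sketch_exit_request;

/* What a kicker would do from the I/O thread. */
static void sketch_request_exit(void)
{
    sketch_mb_set(&sketch_exit_request, true);
}

/* What the CPU loop would do once it has left cpu_exec. */
static void sketch_clear_exit(void)
{
    sketch_mb_set(&sketch_exit_request, false);
}
```

The design trade-off is that the type no longer documents how the variable
must be accessed, which is why the patch adds the "accessed with
atomic_mb_read/atomic_mb_set" comment next to the declarations.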


* Re: [Qemu-devel] [PATCH 6/9] tcg: synchronize cpu->exit_request and cpu->tcg_exit_req accesses
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 6/9] tcg: synchronize cpu->exit_request and cpu->tcg_exit_req accesses Paolo Bonzini
@ 2015-08-28  2:19   ` Emilio G. Cota
  2015-08-28  8:50     ` Paolo Bonzini
  0 siblings, 1 reply; 13+ messages in thread
From: Emilio G. Cota @ 2015-08-28  2:19 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: mttcg, qemu-devel, rth

On Wed, Aug 26, 2015 at 02:17:42 +0200, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  cpu-exec.c | 6 +++++-
>  qom/cpu.c  | 2 ++
>  2 files changed, 7 insertions(+), 1 deletion(-)

I like this patch.

Are we making sure that other writes to tcg_exit_req are preceded by
a write barrier? For instance:

qom/cpu.c-116-    cpu->exit_request = 1;
qom/cpu.c:117:    cpu->tcg_exit_req = 1;

translate-all.c-1478-    } else {
translate-all.c:1479:        cpu->tcg_exit_req = 1;

translate-all.c-1643-    cpu->interrupt_request |= mask;
translate-all.c:1644:    cpu->tcg_exit_req = 1;

Current master certainly doesn't have them, but I wonder if
you have barriers at those places in the tree you're working on. If not
I'd expand this patch to add them.

And while at it, can we pleeease get rid of the hideous 'volatile sig_atomic_t'
type for tcg_exit_req?

Thanks,

		Emilio


* Re: [Qemu-devel] [PATCH v2 0/9] tcg: signal-free qemu_cpu_kick
  2015-08-26  0:17 [Qemu-devel] [PATCH v2 0/9] tcg: signal-free qemu_cpu_kick Paolo Bonzini
                   ` (8 preceding siblings ...)
  2015-08-26  0:17 ` [Qemu-devel] [PATCH 9/9] tcg: signal-free qemu_cpu_kick Paolo Bonzini
@ 2015-08-28  4:21 ` Richard Henderson
  9 siblings, 0 replies; 13+ messages in thread
From: Richard Henderson @ 2015-08-28  4:21 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: mttcg, cota

On 08/25/2015 05:17 PM, Paolo Bonzini wrote:
> This version of the signal-free qemu_cpu_kick patches is, ehm, much
> better.  Variable are accessed either with Java-style volatiles or
> protected by memory barriers, and the cleanups go further by removing
> qemu/tls.h and C volatiles.
>
> The logic is relatively simple.  The I/O thread does (letters in
> parentheses indicates the synchronizes-with edges):
>
>      run_on_cpu or similar
>      ...
>      seq_cst write 1 to exit_request        (C)
>      seq_cst read tcg_current_cpu to cpu    (B)
>      if not NULL
>         write 1 to cpu->exit_request
>         release barrier                     (A)
>         write 1 to cpu->tcg_exit_req
>
> The CPU thread does either this:
>
>      (in generated code) read cpu->tcg_exit_req
>      acquire barrier                        (A)
>      read cpu->exit_request
>      exit from cpu_exec
>      seq_cst write 0 to exit_request
>      ...
>      flush_queued_work or similar
>
> or this:
>
>      seq_cst write to tcg_current_cpu       (B)
>      seq_cst read from exit_request         (C)
>      exit from cpu_exec
>      seq_cst write 0 to exit_request
>      ...
>      flush_queued_work or similar
>
> The non-TLS tcg_current_cpu will go away with multi-threaded TCG.
>
> Paolo
>
> Paolo Bonzini (9):
>    i8257: rewrite DMA_schedule to avoid hooking into the CPU loop
>    i8257: remove cpu_request_exit irq
>    tcg: introduce tcg_current_cpu
>    remove qemu/tls.h
>    tcg: assign cpu->current_tb in a simpler place
>    tcg: synchronize cpu->exit_request and cpu->tcg_exit_req accesses
>    tcg: synchronize exit_request and tcg_current_cpu accesses
>    use qemu_cpu_kick instead of cpu_exit or qemu_cpu_kick_thread
>    tcg: signal-free qemu_cpu_kick

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH 6/9] tcg: synchronize cpu->exit_request and cpu->tcg_exit_req accesses
  2015-08-28  2:19   ` Emilio G. Cota
@ 2015-08-28  8:50     ` Paolo Bonzini
  0 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2015-08-28  8:50 UTC (permalink / raw)
  To: Emilio G. Cota; +Cc: mttcg, qemu-devel, rth



On 28/08/2015 04:19, Emilio G. Cota wrote:
> translate-all.c-1478-    } else {
> translate-all.c:1479:        cpu->tcg_exit_req = 1;

This one is only run in the CPU thread.

> translate-all.c-1643-    cpu->interrupt_request |= mask;
> translate-all.c:1644:    cpu->tcg_exit_req = 1;

This one is only run in user-mode emulation, which also means it is run
in the CPU thread.

Paolo

> Current master certainly doesn't have them, but I wonder if
> you have barriers at those places in the tree you're working on. If not
> I'd expand this patch to add them.


Thread overview: 13+ messages
2015-08-26  0:17 [Qemu-devel] [PATCH v2 0/9] tcg: signal-free qemu_cpu_kick Paolo Bonzini
2015-08-26  0:17 ` [Qemu-devel] [PATCH 1/9] i8257: rewrite DMA_schedule to avoid hooking into the CPU loop Paolo Bonzini
2015-08-26  0:17 ` [Qemu-devel] [PATCH 2/9] i8257: remove cpu_request_exit irq Paolo Bonzini
2015-08-26  0:17 ` [Qemu-devel] [PATCH 3/9] tcg: introduce tcg_current_cpu Paolo Bonzini
2015-08-26  0:17 ` [Qemu-devel] [PATCH 4/9] remove qemu/tls.h Paolo Bonzini
2015-08-26  0:17 ` [Qemu-devel] [PATCH 5/9] tcg: assign cpu->current_tb in a simpler place Paolo Bonzini
2015-08-26  0:17 ` [Qemu-devel] [PATCH 6/9] tcg: synchronize cpu->exit_request and cpu->tcg_exit_req accesses Paolo Bonzini
2015-08-28  2:19   ` Emilio G. Cota
2015-08-28  8:50     ` Paolo Bonzini
2015-08-26  0:17 ` [Qemu-devel] [PATCH 7/9] tcg: synchronize exit_request and tcg_current_cpu accesses Paolo Bonzini
2015-08-26  0:17 ` [Qemu-devel] [PATCH 8/9] use qemu_cpu_kick instead of cpu_exit or qemu_cpu_kick_thread Paolo Bonzini
2015-08-26  0:17 ` [Qemu-devel] [PATCH 9/9] tcg: signal-free qemu_cpu_kick Paolo Bonzini
2015-08-28  4:21 ` [Qemu-devel] [PATCH v2 0/9] " Richard Henderson
