* Re: [Qemu-devel] Release of COREMU, a scalable and portable full-system emulator
2010-07-21 17:04 ` Stefan Weil
@ 2010-07-22 8:48 ` Chen Yufei
2010-07-22 11:05 ` [Qemu-devel] " Jan Kiszka
2010-07-22 12:18 ` [Qemu-devel] " Stefan Hajnoczi
0 siblings, 2 replies; 20+ messages in thread
From: Chen Yufei @ 2010-07-22 8:48 UTC (permalink / raw)
To: Stefan Weil; +Cc: qemu-devel
[-- Attachment #1: Type: text/plain, Size: 2704 bytes --]
On 2010-7-22, at 1:04 AM, Stefan Weil wrote:
> On 21.07.2010 09:03, Chen Yufei wrote:
>> On 2010-7-21, at 5:43 AM, Blue Swirl wrote:
>>
>>
>>> On Sat, Jul 17, 2010 at 10:27 AM, Chen Yufei<cyfdecyf@gmail.com> wrote:
>>>
>>>> We are pleased to announce COREMU, which is a "multicore-on-multicore" full-system emulator built on Qemu. (Simply put, we made Qemu parallel.)
>>>>
>>>> The project web page is located at:
>>>> http://ppi.fudan.edu.cn/coremu
>>>>
>>>> You can also download the source code and disk images to play with from SourceForge:
>>>> http://sf.net/p/coremu
>>>>
>>>> COREMU is composed of
>>>> 1. a parallel emulation library
>>>> 2. a set of patches to qemu
>>>> (We worked on the master branch, commit 54d7cf136f040713095cbc064f62d753bff6f9d2)
>>>>
>>>> It currently supports full-system emulation of x64 and ARM MPcore platforms.
>>>>
>>>> By leveraging the underlying multicore resources, it can emulate up to 255 cores running commodity operating systems (even on a 4-core machine).
>>>>
>>>> Enjoy,
>>>>
>>> Nice work. Do you plan to submit the improvements back to upstream QEMU?
>>>
>> It would be great if we can submit our code to QEMU, but we do not know the process.
>> Would you please give us some instructions?
>>
>> --
>> Best regards,
>> Chen Yufei
>>
>
> Some hints can be found here:
> http://wiki.qemu.org/Contribute/StartHere
>
> Kind regards,
> Stefan Weil
The attached patch was produced with the command:
git diff 54d7cf136f040713095cbc064f62d753bff6f9d2
To separate out what needs to be done to make QEMU parallel, we created a separate library, and the patched QEMU needs to be compiled and linked with that library. To submit our enhancement to QEMU, we may need to incorporate this library into QEMU; I am not sure what the best solution would be.
Our approach to making QEMU parallel is described at http://ppi.fudan.edu.cn/coremu
I will give a short summary here:
1. Each emulated core thread runs a separate binary translation engine and has a private code cache. We marked some variables in TCG as thread-local, and we also modified the TB invalidation mechanism.
2. Each core has a queue holding pending interrupts. The COREMU library provides this queue, and interrupt notification is done by sending real-time signals to the emulated core thread.
3. Atomic instruction emulation has to be modified for parallel emulation. We use lightweight memory transactions, which require only a compare-and-swap instruction, to emulate atomic instructions (a minimal sketch follows after this list).
4. Some code in the original QEMU may cause data races after we make it parallel. We fixed these problems.
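
To make point 3 concrete, here is a minimal sketch (not code from the
patch) of how a guest atomic instruction such as x86 "lock incl" can be
emulated using only a compare-and-swap. The helper name and the use of
GCC's __sync builtin are illustrative assumptions; the primitives the
patch actually uses (atomic_incw, atomic_compare_exchangeq, ...) come
from coremu-atomic.h in the COREMU library.

    #include <stdint.h>

    /* Hypothetical sketch: emulate an atomic 32-bit increment on guest
     * memory mapped at host address 'p'. Read the old value, compute
     * the new value privately, then retry the compare-and-swap until
     * no other core thread has modified the location in between. */
    static inline uint32_t cm_emulate_atomic_incl(uint32_t *p)
    {
        uint32_t old, newv;
        do {
            old = *p;        /* speculative read of guest memory  */
            newv = old + 1;  /* compute the result without a lock */
        } while (__sync_val_compare_and_swap(p, old, newv) != old);
        return newv;         /* value after the atomic increment  */
    }

The same pattern extends to other read-modify-write instructions (xadd,
cmpxchg, bit test-and-set, ...): only the "compute the result" step
changes.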
[-- Attachment #2: patch-to-54d7cf136f040713095cbc064f62d753bff6f9d2 --]
[-- Type: application/octet-stream, Size: 162739 bytes --]
diff --git a/Makefile b/Makefile
index eb9e02b..62419ec 100644
--- a/Makefile
+++ b/Makefile
@@ -135,11 +135,12 @@ iov.o: iov.c iov.h
qemu-img.o: qemu-img-cmds.h
qemu-img.o qemu-tool.o qemu-nbd.o qemu-io.o: $(GENERATED_HEADERS)
-qemu-img$(EXESUF): qemu-img.o qemu-tool.o qemu-error.o $(block-obj-y) $(qobject-obj-y)
+include $(SRC_PATH)/coremu.mk
+qemu-img$(EXESUF): qemu-img.o qemu-tool.o qemu-error.o $(block-obj-y) $(qobject-obj-y) $(COREMU_LIB)
-qemu-nbd$(EXESUF): qemu-nbd.o qemu-tool.o qemu-error.o $(block-obj-y) $(qobject-obj-y)
+qemu-nbd$(EXESUF): qemu-nbd.o qemu-tool.o qemu-error.o $(block-obj-y) $(qobject-obj-y) $(COREMU_LIB)
-qemu-io$(EXESUF): qemu-io.o cmd.o qemu-tool.o qemu-error.o $(block-obj-y) $(qobject-obj-y)
+qemu-io$(EXESUF): qemu-io.o cmd.o qemu-tool.o qemu-error.o $(block-obj-y) $(qobject-obj-y) $(COREMU_LIB)
qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
$(call quiet-command,sh $(SRC_PATH)/hxtool -h < $< > $@," GEN $@")
diff --git a/Makefile.target b/Makefile.target
index c092900..aec7f12 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -58,6 +58,9 @@ libobj-$(TARGET_ARM) += neon_helper.o iwmmxt_helper.o
libobj-y += disas.o
+# COREMU-related objects; we may need to split this later.
+libobj-y += cm-loop.o cm-intr.o cm-target-intr.o
+
$(libobj-y): $(GENERATED_HEADERS)
# libqemu
@@ -300,8 +303,11 @@ endif # CONFIG_SOFTMMU
obj-$(CONFIG_GDBSTUB_XML) += gdbstub-xml.o
-$(QEMU_PROG): $(obj-y) $(obj-$(TARGET_BASE_ARCH)-y)
- $(call LINK,$(obj-y) $(obj-$(TARGET_BASE_ARCH)-y))
+# COREMU_LIB is defined in coremu.mk
+include $(SRC_PATH)/coremu.mk
+
+$(QEMU_PROG): $(obj-y) $(obj-$(TARGET_BASE_ARCH)-y) $(COREMU_LIB)
+ $(call LINK,$(obj-y) $(obj-$(TARGET_BASE_ARCH)-y) -ltopology $(COREMU_LIB))
gdbstub-xml.c: $(TARGET_XML_FILES) $(SRC_PATH)/feature_to_c.sh
diff --git a/cm-init.c b/cm-init.c
new file mode 100644
index 0000000..4dd451a
--- /dev/null
+++ b/cm-init.c
@@ -0,0 +1,131 @@
+/*
+ * COREMU Parallel Emulator Framework
+ *
+ * Initialization stuff for qemu.
+ *
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/* We include this file in exec.c */
+
+#include <sys/types.h>
+#include <sys/mman.h>
+
+#define VERBOSE_COREMU
+#include "sysemu.h"
+#include "coremu-sched.h"
+#include "coremu-debug.h"
+#include "coremu-init.h"
+#include "cm-timer.h"
+#include "cm-init.h"
+
+/* XXX How to clean up the following code? */
+
+/* Since each core uses its own code buffer, we set a large value here. */
+#undef DEFAULT_CODE_GEN_BUFFER_SIZE
+#define DEFAULT_CODE_GEN_BUFFER_SIZE (800 * 1024 * 1024)
+
+static uint64_t cm_bufsize = 0;
+static void *cm_bufbase = NULL;
+#define min(a, b) ((a) < (b) ? (a) : (b))
+
+/* Prepare a large code cache from which each core allocates later */
+static void cm_code_gen_alloc_all(void)
+{
+ int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_32BIT;
+
+ /*cm_bufsize = (min(DEFAULT_CODE_GEN_BUFFER_SIZE, phys_ram_size));*/
+ /* XXX what if this is larger than physical ram size? */
+ cm_bufsize = DEFAULT_CODE_GEN_BUFFER_SIZE;
+ cm_bufbase = mmap(NULL, cm_bufsize, PROT_WRITE | PROT_READ | PROT_EXEC,
+ flags, -1, 0);
+
+ if (cm_bufbase == MAP_FAILED) {
+ cm_assert(0, "mmap failed\n");
+ }
+
+ code_gen_buffer_size = (unsigned long)(cm_bufsize / (smp_cpus));
+ cm_assert(code_gen_buffer_size >= MIN_CODE_GEN_BUFFER_SIZE,
+ "code buffer size too small");
+
+ code_gen_buffer_max_size = code_gen_buffer_size - code_gen_max_block_size();
+ code_gen_max_blocks = code_gen_buffer_size / CODE_GEN_AVG_BLOCK_SIZE;
+}
+
+/* From the allocated memory in code_gen_alloc_all, we allocate memory for each
+ * core. */
+static void cm_code_gen_alloc(void)
+{
+ /* We use cpu_index here; note that this may not be the same as the
+ * architecture-dependent cpu id, e.g. cpuid_apic_id. */
+ code_gen_buffer = cm_bufbase + (code_gen_buffer_size *
+ cpu_single_env->cpu_index);
+
+ /* Allocate space for TBs. */
+ tbs = qemu_malloc(code_gen_max_blocks * sizeof(TranslationBlock));
+
+ /* cm_print("CORE[%u] TC [%lu MB] at %p", cpu_single_env->cpu_index,
+ (code_gen_buffer_size) / (1024 * 1024), code_gen_buffer); */
+}
+
+/* For coremu, code generator related initialization should be called by all
+ * core threads, while the other initialization only needs to be done in the
+ * hardware thread. */
+void cm_cpu_exec_init(void)
+{
+ page_init();
+ io_mem_init();
+
+ /* Allocate code cache. */
+ cm_code_gen_alloc_all();
+
+ /* Code prologue initialization. */
+ cm_code_prologue_init();
+ map_exec(code_gen_prologue, sizeof(code_gen_prologue));
+}
+
+void cm_cpu_exec_init_core(void)
+{
+ cpu_gen_init();
+ /* Get code cache. */
+ cm_code_gen_alloc();
+ code_gen_ptr = code_gen_buffer;
+
+#if defined(TARGET_I386)
+ optimize_flags_init();
+#elif defined(TARGET_ARM)
+ arm_translate_init();
+#endif
+ /* Set up the scheduling for the core thread */
+ coremu_init_sched_core();
+
+ /* Set up ticks mechanism for every core. */
+ cpu_enable_ticks();
+
+ /* Create per core timer. */
+ if (cm_init_local_timer_alarm() < 0) {
+ cm_assert(0, "local alarm initialize failed");
+ }
+
+ /* Wait for other cores to finish initialization. */
+ coremu_wait_init();
+}
diff --git a/cm-init.h b/cm-init.h
new file mode 100644
index 0000000..e0e1632
--- /dev/null
+++ b/cm-init.h
@@ -0,0 +1,37 @@
+/*
+ * COREMU Parallel Emulator Framework
+ *
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _CM_INIT_H
+#define _CM_INIT_H
+
+/* page_init, io_mem_init, etc. Called by hardware thread. */
+void cm_cpu_exec_init(void);
+/* Allocate code buffer for each core. Called by each core. */
+void cm_cpu_exec_init_core(void);
+
+/* This function is defined in tcg/tcg.c */
+void cm_code_prologue_init(void);
+
+#endif /* _CM_INIT_H */
diff --git a/cm-intr.c b/cm-intr.c
new file mode 100644
index 0000000..bcefe36
--- /dev/null
+++ b/cm-intr.c
@@ -0,0 +1,53 @@
+/*
+ * COREMU Parallel Emulator Framework
+ *
+ * The common interface for hardware interrupt sending and handling.
+ *
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include <stdlib.h>
+#include <stdio.h>
+#include "cpu.h"
+
+#include "coremu-intr.h"
+#include "coremu-core.h"
+#include "coremu-malloc.h"
+#include "cm-intr.h"
+
+/* The common interface to handle interrupts; this function should be
+ registered with coremu */
+void cm_common_intr_handler(CMIntr *intr)
+{
+ coremu_assert_core_thr();
+ if (!intr)
+ return;
+ intr->handler(intr);
+ coremu_free(intr);
+}
+
+/* To notify that an event is coming, all qemu needs to do is
+ exit the current cpu loop */
+void cm_notify_event(void)
+{
+ if (cpu_single_env)
+ cpu_exit(cpu_single_env);
+}
diff --git a/cm-intr.h b/cm-intr.h
new file mode 100644
index 0000000..697b18f
--- /dev/null
+++ b/cm-intr.h
@@ -0,0 +1,40 @@
+/*
+ * COREMU Parallel Emulator Framework
+ * Defines qemu-related structures and interfaces for hardware interrupts
+ *
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef CM_INTR_H
+#define CM_INTR_H
+
+/* This is the callback function type used to handle different types of interrupts */
+typedef void (*CMIntr_handler)(void *opaque);
+
+/* Base type for all types of interrupt. Subtypes of CMIntr should have an
+ * object of this struct as their first member. */
+typedef struct CMIntr {
+ CMIntr_handler handler;
+} CMIntr;
+
+void cm_common_intr_handler(CMIntr *opaque);
+void cm_notify_event(void);
+#endif
diff --git a/cm-loop.c b/cm-loop.c
new file mode 100644
index 0000000..5d33743
--- /dev/null
+++ b/cm-loop.c
@@ -0,0 +1,95 @@
+/*
+ * COREMU Parallel Emulator Framework
+ * The definition of core thread function
+ *
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include "cpu.h"
+#include "cpus.h"
+
+#include "coremu-intr.h"
+#include "coremu-debug.h"
+#include "coremu-sched.h"
+#include "coremu-types.h"
+#include "cm-loop.h"
+#include "cm-timer.h"
+#include "cm-init.h"
+
+static bool cm_tcg_cpu_exec(void);
+static bool cm_tcg_cpu_exec(void)
+{
+ int ret = 0;
+ CPUState *env = cpu_single_env;
+ struct timespec halt_interval;
+ halt_interval.tv_sec = 0;
+ halt_interval.tv_nsec = 10000;
+
+ for (;;) {
+ if (cm_local_alarm_pending())
+ cm_run_all_local_timers();
+
+ coremu_receive_intr();
+ if (cm_cpu_can_run(env))
+ ret = cpu_exec(env);
+ else if (env->stop)
+ break;
+
+ if (!cm_vm_can_run())
+ break;
+
+ if (ret == EXCP_DEBUG) {
+ cm_assert(0, "debug support hasn't been finished\n");
+ break;
+ }
+ if (ret == EXCP_HALTED || ret == EXCP_HLT) {
+ coremu_cpu_sched(CM_EVENT_HALTED);
+ }
+ }
+ return ret;
+}
+
+void *cm_cpu_loop(void *args)
+{
+ int ret;
+
+ /* Must initialize cpu_single_env before initializing core thread. */
+ assert(args);
+ cpu_single_env = (CPUState *)args;
+
+ /* Setup dynamic translator */
+ cm_cpu_exec_init_core();
+
+ for (;;) {
+ ret = cm_tcg_cpu_exec();
+ if (test_reset_request()) {
+ coremu_pause_core();
+ continue;
+ }
+ break;
+ }
+ cm_stop_local_timer();
+ coremu_core_exit(NULL);
+ assert(0);
+}
diff --git a/cm-loop.h b/cm-loop.h
new file mode 100644
index 0000000..d03d13d
--- /dev/null
+++ b/cm-loop.h
@@ -0,0 +1,37 @@
+/*
+ * COREMU Parallel Emulator Framework
+ * The definition of core thread function
+ *
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef CM_LOOP_H
+#define CM_LOOP_H
+
+/*#include "cpu.h"*/
+void *cm_cpu_loop(void *args);
+
+/* Wrappers for static functions in qemu */
+/*int cm_cpu_can_run(struct CPUState * env);*/
+int cm_vm_can_run(void);
+
+#endif
+
diff --git a/cm-tbinval.c b/cm-tbinval.c
new file mode 100644
index 0000000..a410403
--- /dev/null
+++ b/cm-tbinval.c
@@ -0,0 +1,199 @@
+/*
+ * COREMU Parallel Emulator Framework
+ *
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include <assert.h>
+#include "coremu-malloc.h"
+#include "coremu-atomic.h"
+#include "coremu-hw.h"
+
+static uint16_t *cm_phys_tb_cnt;
+
+extern void cm_inject_invalidate_code(TranslationBlock *tb);
+static int cm_invalidate_other(int cpu_id, target_phys_addr_t start, int len);
+
+void cm_init_tb_cnt(ram_addr_t ram_offset, ram_addr_t size)
+{
+ coremu_assert_hw_thr("cm_init_bt_cnt should only called by hw thr");
+
+ cm_phys_tb_cnt = coremu_realloc(cm_phys_tb_cnt,
+ ((ram_offset +
+ size) >> TARGET_PAGE_BITS) *
+ sizeof(uint16_t));
+ memset(cm_phys_tb_cnt + (ram_offset >> TARGET_PAGE_BITS), 0x0,
+ (size >> TARGET_PAGE_BITS) * sizeof(uint16_t));
+}
+
+void cm_phys_add_tb(ram_addr_t addr)
+{
+ atomic_incw(&cm_phys_tb_cnt[addr >> TARGET_PAGE_BITS]);
+}
+
+void cm_phys_del_tb(ram_addr_t addr)
+{
+ assert(cm_phys_tb_cnt[addr >> TARGET_PAGE_BITS]);
+ atomic_decw(&cm_phys_tb_cnt[addr >> TARGET_PAGE_BITS]);
+}
+
+uint16_t cm_phys_page_tb_p(ram_addr_t addr)
+{
+ return cm_phys_tb_cnt[addr >> TARGET_PAGE_BITS];
+}
+
+void cm_invalidate_bitmap(CMPageDesc *p)
+{
+ /* Get the bitmap lock */
+ coremu_spin_lock(&p->bitmap_lock);
+
+ if (p->code_bitmap) {
+ coremu_free(p->code_bitmap);
+ p->code_bitmap = NULL;
+ }
+
+ /* Unlock the bitmap lock */
+ coremu_spin_unlock(&p->bitmap_lock);
+
+}
+
+void cm_invalidate_tb(target_phys_addr_t start, int len)
+{
+ int count = tb_phys_invalidate_count;
+ if (!coremu_hw_thr_p()) {
+ tb_invalidate_phys_page_fast(start, len);
+ count = tb_phys_invalidate_count - count;
+ }
+
+ if ((!cm_phys_page_tb_p(start)) || (cm_phys_page_tb_p(start) == count))
+ goto done;
+
+#ifdef COREMU_CMC_SUPPORT
+ /* XXX: not finished, need lazy invalidation here! */
+ int have_done = count;
+ int cpu_idx = 0;
+ for (cpu_idx = 0; cpu_idx < coremu_get_targetcpu(); cpu_idx++) {
+ if ((!coremu_hw_thr_p()) && cpu_idx == cpu_single_env->cpuid_apic_id)
+ continue;
+ have_done += cm_invalidate_other(cpu_idx, start, len);
+ if (have_done > cm_phys_page_tb_p(start))
+ break;
+ }
+#endif
+
+done:
+ return;
+}
+
+void cm_tlb_reset_dirty_range(CPUTLBEntry *tlb_entry,
+ unsigned long start, unsigned long length)
+{
+ unsigned long addr, old, addend;
+ old = tlb_entry->addr_write;
+ addend = tlb_entry->addend;
+
+ if ((old & ~TARGET_PAGE_MASK) == IO_MEM_RAM) {
+ addr = (old & TARGET_PAGE_MASK) + addend;
+ if ((addr - start) < length) {
+ uint64_t newv = (tlb_entry->addr_write & TARGET_PAGE_MASK) |
+ TLB_NOTDIRTY;
+ atomic_compare_exchangeq(&tlb_entry->addr_write, old, newv);
+ }
+ }
+}
+
+/* Try to lazily invalidate the TBs of CPU[cpu_id].
+ * return 1: successfully found and invalidated a TB of CPU[cpu_id]
+ * 0: no such TB exists
+ */
+static int cm_lazy_invalidate_tb(TranslationBlock *tbs,
+ target_phys_addr_t start, int len)
+{
+ int n, ret = 0;
+ TranslationBlock *tb_next;
+ TranslationBlock *tb = tbs;
+
+ target_phys_addr_t end = start + len;
+ target_ulong tb_start, tb_end;
+
+ while (tb != NULL) {
+ n = (long)tb & 3;
+ tb = (TranslationBlock *)((long)tb & ~3);
+ tb_next = tb->page_next[n];
+ /* NOTE: this is subtle as a TB may span two physical pages */
+ if (n == 0) {
+ /* NOTE: tb_end may be after the end of the page, but
+ it is not a problem */
+ tb_start = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
+ tb_end = tb_start + tb->size;
+ } else {
+ tb_start = tb->page_addr[1];
+ tb_end = tb_start + ((tb->pc + tb->size) & ~TARGET_PAGE_MASK);
+ }
+ if (!(tb_end <= start || tb_start >= end)) {
+
+ /* change the code cache of the tb */
+ cm_inject_invalidate_code(tb);
+ ret = 1;
+ }
+ tb = tb_next;
+ }
+
+ return ret;
+}
+
+
+/* Try to invalidate the TBs of CPU[cpu_id].
+ * return 1: successfully found and invalidated a TB of CPU[cpu_id]
+ * 0: no such TB exists
+ */
+static int cm_invalidate_other(int cpu_id, target_phys_addr_t start, int len)
+{
+ /* Find whether any TB exists that intersects with [start, start+len) */
+ PageDesc *p = page_find(start >> TARGET_PAGE_BITS);
+ if (!p)
+ return 0;
+
+ int offset, b;
+ uint8_t *bit_map;
+ int need_invalidate = 1;
+ int ret = 0;
+
+ coremu_spin_lock(&p->cpu_tbs[cpu_id].bitmap_lock);
+ bit_map = p->cpu_tbs[cpu_id].code_bitmap;
+ if (bit_map) {
+ offset = start & ~TARGET_PAGE_MASK;
+ b = bit_map[offset >> 3] >> (offset & 7);
+ if (!(b & ((1 << len) - 1)))
+ need_invalidate = 0;
+ }
+ coremu_spin_unlock(&p->cpu_tbs[cpu_id].bitmap_lock);
+
+ if (need_invalidate) {
+ coremu_spin_lock(&p->cpu_tbs[cpu_id].tb_list_lock);
+ //find the code ptr
+ ret = cm_lazy_invalidate_tb(p->cpu_tbs[cpu_id].first_tb, start, len);
+ coremu_spin_unlock(&p->cpu_tbs[cpu_id].tb_list_lock);
+ //change the code here
+ }
+
+ return ret;
+}
diff --git a/cm-tbinval.h b/cm-tbinval.h
new file mode 100644
index 0000000..9a1879a
--- /dev/null
+++ b/cm-tbinval.h
@@ -0,0 +1,51 @@
+/*
+ * COREMU Parallel Emulator Framework
+ *
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _CM_TBINVAL_H
+#define _CM_TBINVAL_H
+
+typedef struct {
+ /* List of TBs of this cpu intersecting this ram page */
+ TranslationBlock *first_tb;
+ /* This lock guards against conflicts between other cores' reads and self-modifying code */
+ CMSpinLock tb_list_lock;
+
+ /* Use a bitmap to optimize self-modifying code handling */
+ uint8_t *code_bitmap;
+ CMSpinLock bitmap_lock;
+} CMPageDesc;
+
+void cm_init_tb_cnt(ram_addr_t ram_offset, ram_addr_t size);
+void cm_phys_add_tb(ram_addr_t addr);
+void cm_phys_del_tb(ram_addr_t addr);
+uint16_t cm_phys_page_tb_p(ram_addr_t addr);
+
+void cm_invalidate_bitmap(CMPageDesc *p);
+void cm_invalidate_tb(target_phys_addr_t start, int len);
+
+void cm_tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, unsigned long start,
+ unsigned long length);
+
+#endif
diff --git a/cm-timer.c b/cm-timer.c
new file mode 100644
index 0000000..3c6a0a1
--- /dev/null
+++ b/cm-timer.c
@@ -0,0 +1,265 @@
+/*
+ * COREMU Parallel Emulator Framework
+ *
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ * Xi Wu <wuxi@fudan.edu.cn>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/* We include this file in qemu-timer.c: qemu_alarm_timer is defined there, and
+ * there are lots of static functions in it. */
+#include "coremu-sched.h"
+#include <math.h>
+int cm_pit_freq;
+
+static int64_t cm_local_next_deadline(void);
+static uint64_t cm_local_next_deadline_dyntick(void);
+static void cm_local_dynticks_rearm_timer(struct qemu_alarm_timer *t);
+static void cm_qemu_run_local_timers(QEMUClock *clock);
+
+COREMU_THREAD QEMUTimer *cm_local_active_timers;
+COREMU_THREAD struct qemu_alarm_timer *cm_local_alarm_timer;
+static COREMU_THREAD struct qemu_alarm_timer cm_local_alarm_timers[] = {
+ {"dynticks", dynticks_start_timer,
+ dynticks_stop_timer, cm_local_dynticks_rearm_timer, NULL},
+ {NULL,}
+};
+
+void cm_init_pit_freq(void)
+{
+ double v_num = coremu_get_targetcpu();
+ double p_num = coremu_get_hostcpu();
+ double p_root = sqrt(p_num) / 4;
+ double suggest = p_root * pow(v_num / p_num, p_root);
+ int pit_freq_suggest = ceil(suggest);
+ cm_pit_freq = 1193182 / pit_freq_suggest;
+}
+
+/* Called by each core thread to create a local timer. */
+int cm_init_local_timer_alarm(void)
+{
+ coremu_assert_core_thr();
+ /* core thr block the Timer Alarm signal */
+ struct qemu_alarm_timer *t = NULL;
+ int i, err = -1;
+
+ for (i = 0; cm_local_alarm_timers[i].name; i++) {
+ t = &cm_local_alarm_timers[i];
+ if (!t)
+ return 0;
+ err = t->start(t);
+ if (!err)
+ break;
+ }
+
+ if (err) {
+ err = -ENOENT;
+ goto fail;
+ }
+
+ /* first event is at time 0 */
+ t->pending = 1;
+ cm_local_alarm_timer = t;
+
+ return 0;
+
+fail:
+ return err;
+}
+
+/* Modify the local virtual timer for a core.
+ Because for x86_64 there is only one timer for every core,
+ there is no need to maintain the linked list.
+*/
+void cm_mod_local_timer(QEMUTimer *ts, int64_t expire_time)
+{
+ QEMUTimer **pt, *t;
+
+ cm_del_local_timer(ts);
+
+ /* add the timer in the sorted list */
+ /* NOTE: this code must be signal safe because
+ qemu_timer_expired() can be called from a signal. */
+ pt = &cm_local_active_timers;
+ for (;;) {
+ t = *pt;
+ if (!t)
+ break;
+ if (t->expire_time > expire_time)
+ break;
+ pt = &t->next;
+ }
+ ts->expire_time = expire_time;
+ ts->next = *pt;
+ *pt = ts;
+
+ /* Rearm if necessary */
+ if (pt == &cm_local_active_timers) {
+ if (!cm_local_alarm_timer->pending) {
+ qemu_rearm_alarm_timer(cm_local_alarm_timer);
+ }
+ }
+}
+
+void cm_del_local_timer(QEMUTimer *ts)
+{
+ QEMUTimer **pt, *t;
+
+ /* NOTE: this code must be signal safe because
+ qemu_timer_expired() can be called from a signal. */
+ pt = &cm_local_active_timers;
+ for (;;) {
+ t = *pt;
+ if (!t)
+ break;
+ if (t == ts) {
+ *pt = t->next;
+ break;
+ }
+ pt = &t->next;
+ }
+}
+
+int cm_local_alarm_pending(void)
+{
+ return cm_local_alarm_timer->pending;
+}
+
+void cm_run_all_local_timers(void)
+{
+ cm_local_alarm_timer->pending = 0;
+
+ /* rearm timer, if not periodic */
+ if (cm_local_alarm_timer->expired) {
+ cm_local_alarm_timer->expired = 0;
+ qemu_rearm_alarm_timer(cm_local_alarm_timer);
+ }
+
+ if (vm_running) {
+ cm_qemu_run_local_timers(vm_clock);
+ }
+}
+
+void cm_local_host_alarm_handler(int host_signum)
+{
+ coremu_assert_core_thr();
+
+ struct qemu_alarm_timer *t = cm_local_alarm_timer;
+ if (!t)
+ return;
+
+ if (alarm_has_dynticks(t) ||
+ qemu_timer_expired(cm_local_active_timers, qemu_get_clock(vm_clock))) {
+ t->expired = alarm_has_dynticks(t);
+ t->pending = 1;
+ cm_notify_event();
+ }
+}
+
+static void cm_qemu_run_local_timers(QEMUClock *clock)
+{
+ QEMUTimer **ptimer_head, *ts;
+ int64_t current_time;
+
+ if (!clock->enabled)
+ return;
+
+ current_time = qemu_get_clock(clock);
+ ptimer_head = &cm_local_active_timers;
+ for (;;) {
+ ts = *ptimer_head;
+ if (!ts || ts->expire_time > current_time)
+ break;
+ /* remove timer from the list before calling the callback */
+ *ptimer_head = ts->next;
+ ts->next = NULL;
+
+ /* run the callback (the timer list can be modified) */
+ ts->cb(ts->opaque);
+ }
+}
+
+static void cm_local_dynticks_rearm_timer(struct qemu_alarm_timer *t)
+{
+ timer_t host_timer = (timer_t)(long)t->priv;
+ struct itimerspec timeout;
+ int64_t nearest_delta_us = INT64_MAX;
+ int64_t current_us;
+
+ assert(alarm_has_dynticks(t));
+ if (!cm_local_active_timers)
+ return;
+
+ nearest_delta_us = cm_local_next_deadline_dyntick();
+
+ /* check whether a timer is already running */
+ if (timer_gettime(host_timer, &timeout)) {
+ perror("gettime");
+ fprintf(stderr, "Internal timer error: aborting\n");
+ exit(1);
+ }
+ current_us =
+ timeout.it_value.tv_sec * 1000000 + timeout.it_value.tv_nsec / 1000;
+ if (current_us && current_us <= nearest_delta_us)
+ return;
+
+ timeout.it_interval.tv_sec = 0;
+ timeout.it_interval.tv_nsec = 0; /* 0 for one-shot timer */
+ timeout.it_value.tv_sec = nearest_delta_us / 1000000;
+ timeout.it_value.tv_nsec = (nearest_delta_us % 1000000) * 1000;
+ if (timer_settime(host_timer, 0 /* RELATIVE */, &timeout, NULL)) {
+ perror("settime");
+ fprintf(stderr, "Internal timer error: aborting\n");
+ exit(1);
+ }
+}
+
+static uint64_t cm_local_next_deadline_dyntick(void)
+{
+ int64_t delta;
+
+ delta = (cm_local_next_deadline() + 999) / 1000;
+
+ if (delta < MIN_TIMER_REARM_US)
+ delta = MIN_TIMER_REARM_US;
+
+ return delta;
+}
+
+static int64_t cm_local_next_deadline(void)
+{
+ /* To avoid problems with overflow limit this to 2^32. */
+ int64_t delta = INT32_MAX;
+
+ if (cm_local_active_timers) {
+ delta = cm_local_active_timers->expire_time - qemu_get_clock(vm_clock);
+ }
+
+ if (delta < 0)
+ delta = 0;
+
+ return delta;
+}
+
+void cm_stop_local_timer(void)
+{
+ cm_local_alarm_timer->stop(cm_local_alarm_timer);
+}
diff --git a/cm-timer.h b/cm-timer.h
new file mode 100644
index 0000000..627819e
--- /dev/null
+++ b/cm-timer.h
@@ -0,0 +1,39 @@
+/*
+ * COREMU Parallel Emulator Framework
+ * The definition of core thread function
+ *
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ * Xi Wu <wuxi@fudan.edu.cn>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef CM_TIMER_H
+#define CM_TIMER_H
+
+#include "qemu-common.h"
+int cm_init_local_timer_alarm(void);
+void cm_mod_local_timer(QEMUTimer * ts, int64_t expire_time);
+void cm_del_local_timer(QEMUTimer * ts);
+void cm_run_all_local_timers(void);
+void cm_local_host_alarm_handler(int host_signum);
+int cm_local_alarm_pending(void);
+void cm_init_pit_freq(void);
+void cm_stop_local_timer(void);
+#endif
diff --git a/cpu-all.h b/cpu-all.h
index 52a1817..0110aac 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -22,6 +22,8 @@
#include "qemu-common.h"
#include "cpu-common.h"
+#include "coremu-config.h"
+
/* some important defines:
*
* WORDS_ALIGNED : if defined, the host cpu can only make word aligned
@@ -772,7 +774,7 @@ void cpu_dump_statistics (CPUState *env, FILE *f,
void QEMU_NORETURN cpu_abort(CPUState *env, const char *fmt, ...)
__attribute__ ((__format__ (__printf__, 2, 3)));
extern CPUState *first_cpu;
-extern CPUState *cpu_single_env;
+extern COREMU_THREAD CPUState *cpu_single_env;
#define CPU_INTERRUPT_HARD 0x02 /* hardware interrupt pending */
#define CPU_INTERRUPT_EXITTB 0x04 /* exit the current TB (use for x86 a20 case) */
diff --git a/cpu-exec.c b/cpu-exec.c
index dc81e79..086d330 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -22,6 +22,9 @@
#include "tcg.h"
#include "kvm.h"
+#include "coremu-config.h"
+#include "coremu-intr.h"
+
#if !defined(CONFIG_SOFTMMU)
#undef EAX
#undef ECX
@@ -44,7 +47,7 @@
#define env cpu_single_env
#endif
-int tb_invalidated_flag;
+COREMU_THREAD int tb_invalidated_flag;
//#define CONFIG_DEBUG_EXEC
//#define DEBUG_SIGNAL
@@ -224,8 +227,9 @@ int cpu_exec(CPUState *env1)
if (cpu_halted(env1) == EXCP_HALTED)
return EXCP_HALTED;
+#ifndef CONFIG_COREMU
cpu_single_env = env1;
-
+#endif
/* the access to env below is actually saving the global register's
value, so that files not including target-xyz/exec.h are free to
use it. */
@@ -264,6 +268,9 @@ int cpu_exec(CPUState *env1)
/* prepare setjmp context for exception handling */
for(;;) {
if (setjmp(env->jmp_env) == 0) {
+#ifdef CONFIG_COREMU
+ coremu_receive_intr();
+#endif
#if defined(__sparc__) && !defined(CONFIG_SOLARIS)
#undef env
env = cpu_single_env;
@@ -601,7 +608,27 @@ int cpu_exec(CPUState *env1)
env = cpu_single_env;
#define env cpu_single_env
#endif
+
+#ifdef CONFIG_COREMU
+ coremu_receive_intr();
+#endif
next_tb = tcg_qemu_tb_exec(tc_ptr);
+
+#ifdef CONFIG_COREMU
+ coremu_receive_intr();
+
+#ifdef COREMU_CMC_SUPPORT
+ if((next_tb & 3) == 3) {
+ //assert(0);
+ /* this tb has been invalidated */
+ TranslationBlock *tmp_tb = (TranslationBlock *)(next_tb & ~3);
+ next_tb = 0;
+ cpu_pc_from_tb(env, tmp_tb);
+ tb_phys_invalidate(tmp_tb, -1);
+ }
+#endif
+#endif
+
env->current_tb = NULL;
if ((next_tb & 3) == 2) {
/* Instruction counter expired. */
@@ -665,8 +692,10 @@ int cpu_exec(CPUState *env1)
asm("");
env = (void *) saved_env_reg;
+#ifndef CONFIG_COREMU
/* fail safe : never use cpu_single_env outside cpu_exec() */
cpu_single_env = NULL;
+#endif
return ret;
}
diff --git a/cpus.c b/cpus.c
index 29462e5..5bdcd65 100644
--- a/cpus.c
+++ b/cpus.c
@@ -33,6 +33,8 @@
#include "cpus.h"
+#include "coremu-config.h"
+
#ifdef SIGRTMIN
#define SIG_IPI (SIGRTMIN+4)
#else
@@ -269,7 +271,10 @@ void qemu_notify_event(void)
{
CPUState *env = cpu_single_env;
- qemu_event_increment ();
+#ifndef CONFIG_COREMU
+ qemu_event_increment();
+#endif
+
if (env) {
cpu_exit(env);
}
@@ -812,3 +817,11 @@ void list_cpus(FILE *f, int (*cpu_fprintf)(FILE *f, const char *fmt, ...),
cpu_list(f, cpu_fprintf); /* deprecated */
#endif
}
+
+#ifdef CONFIG_COREMU
+int cm_cpu_can_run(CPUState * env);
+int cm_cpu_can_run(CPUState * env)
+{
+ return cpu_can_run(env);
+}
+#endif
diff --git a/exec-all.h b/exec-all.h
index 1016de2..50bc79a 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -22,6 +22,8 @@
#include "qemu-common.h"
+#include "coremu-config.h"
+
/* allow to see translation results - the slowdown should be negligible, so we leave it */
#define DEBUG_DISAS
@@ -69,9 +71,9 @@ typedef struct TranslationBlock TranslationBlock;
#define OPPARAM_BUF_SIZE (OPC_BUF_SIZE * MAX_OPC_PARAM)
-extern target_ulong gen_opc_pc[OPC_BUF_SIZE];
-extern uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-extern uint16_t gen_opc_icount[OPC_BUF_SIZE];
+extern COREMU_THREAD target_ulong gen_opc_pc[OPC_BUF_SIZE];
+extern COREMU_THREAD uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
+extern COREMU_THREAD uint16_t gen_opc_icount[OPC_BUF_SIZE];
#include "qemu-log.h"
@@ -162,6 +164,9 @@ struct TranslationBlock {
struct TranslationBlock *jmp_next[2];
struct TranslationBlock *jmp_first;
uint32_t icount;
+#ifdef CONFIG_COREMU
+ uint16_t has_invalidate; /* if this TB has been invalidated */
+#endif
};
static inline unsigned int tb_jmp_cache_hash_page(target_ulong pc)
@@ -191,8 +196,8 @@ void tb_link_page(TranslationBlock *tb,
tb_page_addr_t phys_pc, tb_page_addr_t phys_page2);
void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr);
-extern TranslationBlock *tb_phys_hash[CODE_GEN_PHYS_HASH_SIZE];
-extern uint8_t *code_gen_ptr;
+extern COREMU_THREAD TranslationBlock *tb_phys_hash[CODE_GEN_PHYS_HASH_SIZE];
+extern COREMU_THREAD uint8_t *code_gen_ptr;
extern int code_gen_max_blocks;
#if defined(USE_DIRECT_JUMP)
@@ -273,9 +278,9 @@ TranslationBlock *tb_find_pc(unsigned long pc_ptr);
#include "qemu-lock.h"
-extern spinlock_t tb_lock;
+extern COREMU_THREAD spinlock_t tb_lock;
-extern int tb_invalidated_flag;
+extern COREMU_THREAD int tb_invalidated_flag;
#if !defined(CONFIG_USER_ONLY)
diff --git a/exec.c b/exec.c
index 3416aed..4d4064f 100644
--- a/exec.c
+++ b/exec.c
@@ -71,6 +71,13 @@
//#define DEBUG_IOPORT
//#define DEBUG_SUBPAGE
+#include "coremu-config.h"
+#include "coremu-spinlock.h"
+#include "coremu-malloc.h"
+#include "coremu-atomic.h"
+#include "coremu-hw.h"
+#include "cm-tbinval.h"
+
#if !defined(CONFIG_USER_ONLY)
/* TB consistency checks only implemented for usermode emulation. */
#undef DEBUG_TB_CHECK
@@ -78,12 +85,12 @@
#define SMC_BITMAP_USE_THRESHOLD 10
-static TranslationBlock *tbs;
+static COREMU_THREAD TranslationBlock *tbs;
int code_gen_max_blocks;
-TranslationBlock *tb_phys_hash[CODE_GEN_PHYS_HASH_SIZE];
-static int nb_tbs;
+COREMU_THREAD TranslationBlock *tb_phys_hash[CODE_GEN_PHYS_HASH_SIZE];
+static COREMU_THREAD int nb_tbs;
/* any access to the tbs or the page table must use this lock */
-spinlock_t tb_lock = SPIN_LOCK_UNLOCKED;
+COREMU_THREAD spinlock_t tb_lock = SPIN_LOCK_UNLOCKED;
#if defined(__arm__) || defined(__sparc_v9__)
/* The prologue must be reachable with a direct jump. ARM and Sparc64
@@ -102,11 +109,11 @@ spinlock_t tb_lock = SPIN_LOCK_UNLOCKED;
#endif
uint8_t code_gen_prologue[1024] code_gen_section;
-static uint8_t *code_gen_buffer;
+static COREMU_THREAD uint8_t *code_gen_buffer;
static unsigned long code_gen_buffer_size;
/* threshold to flush the translated code buffer */
static unsigned long code_gen_buffer_max_size;
-uint8_t *code_gen_ptr;
+COREMU_THREAD uint8_t *code_gen_ptr;
#if !defined(CONFIG_USER_ONLY)
int phys_ram_fd;
@@ -130,7 +137,7 @@ ram_addr_t last_ram_offset;
CPUState *first_cpu;
/* current CPU in the current thread. It is only valid inside
cpu_exec() */
-CPUState *cpu_single_env;
+COREMU_THREAD CPUState *cpu_single_env;
/* 0 = Do not count executed instructions.
1 = Precise instruction counting.
2 = Adaptive rate instruction counting. */
@@ -139,6 +146,16 @@ int use_icount = 0;
include some instructions that have not yet been executed. */
int64_t qemu_icount;
+#ifdef CONFIG_COREMU
+typedef struct PageDesc {
+ /* in order to optimize self modifying code, we count the number
+ of lookups we do to a given page to use a bitmap */
+ unsigned int code_write_count;
+
+ /* Per-cpu page and tb information */
+ CMPageDesc cpu_tbs[COREMU_MAX_CPU];
+} PageDesc;
+#else
typedef struct PageDesc {
/* list of TBs intersecting this ram page */
TranslationBlock *first_tb;
@@ -150,7 +167,7 @@ typedef struct PageDesc {
unsigned long flags;
#endif
} PageDesc;
-
+#endif
/* In system mode we want L1_MAP to be based on ram offsets,
while in user mode we want it to be based on virtual addresses. */
#if !defined(CONFIG_USER_ONLY)
@@ -237,7 +254,7 @@ static int log_append = 0;
static int tlb_flush_count;
#endif
static int tb_flush_count;
-static int tb_phys_invalidate_count;
+static COREMU_THREAD int tb_phys_invalidate_count;
#ifdef _WIN32
static void map_exec(void *addr, long size)
@@ -383,8 +400,13 @@ static PageDesc *page_find_alloc(tb_page_addr_t index, int alloc)
if (!alloc) {
return NULL;
}
+#ifdef CONFIG_COREMU
+ coremu_atomic_mallocz(lp, sizeof(void *) * L2_SIZE);
+ p = *lp;
+#else
ALLOC(p, sizeof(void *) * L2_SIZE);
*lp = p;
+#endif
}
lp = p + ((index >> (i * L2_BITS)) & (L2_SIZE - 1));
@@ -395,8 +417,13 @@ static PageDesc *page_find_alloc(tb_page_addr_t index, int alloc)
if (!alloc) {
return NULL;
}
+#ifdef CONFIG_COREMU
+ coremu_atomic_mallocz(lp, sizeof(PageDesc) * L2_SIZE);
+ pd = *lp;
+#else
ALLOC(pd, sizeof(PageDesc) * L2_SIZE);
*lp = pd;
+#endif
}
#undef ALLOC
@@ -426,7 +453,12 @@ static PhysPageDesc *phys_page_find_alloc(target_phys_addr_t index, int alloc)
if (!alloc) {
return NULL;
}
+#ifdef CONFIG_COREMU
+ coremu_atomic_mallocz(lp, sizeof(void *) * L2_SIZE);
+ p = *lp;
+#else
*lp = p = qemu_mallocz(sizeof(void *) * L2_SIZE);
+#endif
}
lp = p + ((index >> (i * L2_BITS)) & (L2_SIZE - 1));
}
@@ -438,9 +470,12 @@ static PhysPageDesc *phys_page_find_alloc(target_phys_addr_t index, int alloc)
if (!alloc) {
return NULL;
}
-
+#ifdef CONFIG_COREMU
+ coremu_atomic_mallocz(lp, sizeof(PhysPageDesc) * L2_SIZE);
+ pd = *lp;
+#else
*lp = pd = qemu_malloc(sizeof(PhysPageDesc) * L2_SIZE);
-
+#endif
for (i = 0; i < L2_SIZE; i++) {
pd[i].phys_offset = IO_MEM_UNASSIGNED;
pd[i].region_offset = (index + i) << TARGET_PAGE_BITS;
@@ -649,10 +684,19 @@ void cpu_exec_init(CPUState *env)
static inline void invalidate_page_bitmap(PageDesc *p)
{
+#ifdef CONFIG_COREMU
+#if defined(TARGET_I386)
+ int cpuid = cpu_single_env->cpuid_apic_id;
+#elif defined(TARGET_ARM)
+ int cpuid = cpu_single_env->cpu_index;
+#endif
+ cm_invalidate_bitmap(&p->cpu_tbs[cpuid]);
+#else
if (p->code_bitmap) {
qemu_free(p->code_bitmap);
p->code_bitmap = NULL;
}
+#endif
p->code_write_count = 0;
}
@@ -665,10 +709,20 @@ static void page_flush_tb_1 (int level, void **lp)
if (*lp == NULL) {
return;
}
+#if defined(TARGET_I386)
+ int cpuid = cpu_single_env->cpuid_apic_id;
+#elif defined(TARGET_ARM)
+ int cpuid = cpu_single_env->cpu_index;
+#endif
if (level == 0) {
PageDesc *pd = *lp;
for (i = 0; i < L2_SIZE; ++i) {
+#ifdef CONFIG_COREMU
+ /* XXX only flush tb for the corresponding cpu. */
+ pd[i].cpu_tbs[cpuid].first_tb = NULL;
+#else
pd[i].first_tb = NULL;
+#endif
invalidate_page_bitmap(pd + i);
}
} else {
@@ -691,7 +745,9 @@ static void page_flush_tb(void)
/* XXX: tb_flush is currently not thread safe */
void tb_flush(CPUState *env1)
{
+#ifndef CONFIG_COREMU
CPUState *env;
+#endif
#if defined(DEBUG_FLUSH)
printf("qemu: flush code_size=%ld nb_tbs=%d avg_tb_size=%ld\n",
(unsigned long)(code_gen_ptr - code_gen_buffer),
@@ -702,10 +758,13 @@ void tb_flush(CPUState *env1)
cpu_abort(env1, "Internal error: code buffer overflow\n");
nb_tbs = 0;
-
+#ifdef CONFIG_COREMU
+ memset (env1->tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof (void *));
+#else
for(env = first_cpu; env != NULL; env = env->next_cpu) {
memset (env->tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof (void *));
}
+#endif
memset (tb_phys_hash, 0, CODE_GEN_PHYS_HASH_SIZE * sizeof (void *));
page_flush_tb();
@@ -829,7 +888,14 @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
unsigned int h, n1;
tb_page_addr_t phys_pc;
TranslationBlock *tb1, *tb2;
-
+#ifdef CONFIG_COREMU
+#if defined(TARGET_I386)
+ int cpuid = cpu_single_env->cpuid_apic_id;
+#elif defined(TARGET_ARM)
+ int cpuid = cpu_single_env->cpu_index;
+#endif
+ CMPageDesc *cp;
+#endif
/* remove the TB from the hash list */
phys_pc = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
h = tb_phys_hash_func(phys_pc);
@@ -839,12 +905,26 @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
/* remove the TB from the page list */
if (tb->page_addr[0] != page_addr) {
p = page_find(tb->page_addr[0] >> TARGET_PAGE_BITS);
+#ifdef CONFIG_COREMU
+ cp = &p->cpu_tbs[cpuid];
+ coremu_spin_lock(&cp->tb_list_lock);
+ tb_page_remove(&cp->first_tb, tb);
+ coremu_spin_unlock(&cp->tb_list_lock);
+#else
tb_page_remove(&p->first_tb, tb);
+#endif
invalidate_page_bitmap(p);
}
if (tb->page_addr[1] != -1 && tb->page_addr[1] != page_addr) {
p = page_find(tb->page_addr[1] >> TARGET_PAGE_BITS);
+#ifdef CONFIG_COREMU
+ cp = &p->cpu_tbs[cpuid];
+ coremu_spin_lock(&cp->tb_list_lock);
+ tb_page_remove(&cp->first_tb, tb);
+ coremu_spin_unlock(&cp->tb_list_lock);
+#else
tb_page_remove(&p->first_tb, tb);
+#endif
invalidate_page_bitmap(p);
}
@@ -852,10 +932,17 @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
/* remove the TB from the hash list */
h = tb_jmp_cache_hash_func(tb->pc);
+
+#ifdef CONFIG_COREMU
+ env = cpu_single_env;
+ if (env->tb_jmp_cache[h] == tb)
+ env->tb_jmp_cache[h] = NULL;
+#else
for(env = first_cpu; env != NULL; env = env->next_cpu) {
if (env->tb_jmp_cache[h] == tb)
env->tb_jmp_cache[h] = NULL;
}
+#endif
/* suppress this TB from the two jump lists */
tb_jmp_remove(tb, 0);
@@ -909,10 +996,20 @@ static void build_page_bitmap(PageDesc *p)
{
int n, tb_start, tb_end;
TranslationBlock *tb;
-
+#ifdef CONFIG_COREMU
+#if defined(TARGET_I386)
+ int cpuid = cpu_single_env->cpuid_apic_id;
+#elif defined(TARGET_ARM)
+ int cpuid = cpu_single_env->cpu_index;
+#endif
+ CMPageDesc *cp = &p->cpu_tbs[cpuid];
+ coremu_spin_lock(&cp->bitmap_lock);
+ cp->code_bitmap = coremu_mallocz(TARGET_PAGE_SIZE / 8);
+ tb = cp->first_tb;
+#else
p->code_bitmap = qemu_mallocz(TARGET_PAGE_SIZE / 8);
-
tb = p->first_tb;
+#endif
while (tb != NULL) {
n = (long)tb & 3;
tb = (TranslationBlock *)((long)tb & ~3);
@@ -928,9 +1025,16 @@ static void build_page_bitmap(PageDesc *p)
tb_start = 0;
tb_end = ((tb->pc + tb->size) & ~TARGET_PAGE_MASK);
}
+#ifdef CONFIG_COREMU
+ set_bits(cp->code_bitmap, tb_start, tb_end - tb_start);
+#else
set_bits(p->code_bitmap, tb_start, tb_end - tb_start);
+#endif
tb = tb->page_next[n];
}
+#ifdef CONFIG_COREMU
+ coremu_spin_unlock(&cp->bitmap_lock);
+#endif
}
TranslationBlock *tb_gen_code(CPUState *env,
@@ -996,6 +1100,22 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
p = page_find(start >> TARGET_PAGE_BITS);
if (!p)
return;
+#ifdef CONFIG_COREMU
+#if defined(TARGET_I386)
+ int cpuid = cpu_single_env->cpuid_apic_id;
+#elif defined(TARGET_ARM)
+ int cpuid = cpu_single_env->cpu_index;
+#endif
+ atomic_incl((uint32_t *)&p->code_write_count);
+ if (!p->cpu_tbs[cpuid].code_bitmap &&
+ p->code_write_count >= SMC_BITMAP_USE_THRESHOLD &&
+ is_cpu_write_access) {
+ /* build code bitmap */
+ build_page_bitmap(p);
+ }
+
+ tb = p->cpu_tbs[cpuid].first_tb;
+#else
if (!p->code_bitmap &&
++p->code_write_count >= SMC_BITMAP_USE_THRESHOLD &&
is_cpu_write_access) {
@@ -1006,6 +1126,8 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
/* we remove all the TBs in the range [start, end[ */
/* XXX: see if in some cases it could be faster to invalidate all the code */
tb = p->first_tb;
+#endif
+
while (tb != NULL) {
n = (long)tb & 3;
tb = (TranslationBlock *)((long)tb & ~3);
@@ -1052,6 +1174,9 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
saved_tb = env->current_tb;
env->current_tb = NULL;
}
+#ifdef CONFIG_COREMU
+ tb->has_invalidate = 1;
+#endif
tb_phys_invalidate(tb, -1);
if (env) {
env->current_tb = saved_tb;
@@ -1063,6 +1188,15 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
}
#if !defined(CONFIG_USER_ONLY)
/* if no code remaining, no need to continue to use slow writes */
+#ifdef CONFIG_COREMU
+ if (!p->cpu_tbs[cpuid].first_tb) {
+ invalidate_page_bitmap(p);
+ cm_phys_del_tb(start);
+ if ((!cm_phys_page_tb_p(start)) && is_cpu_write_access) {
+ tlb_unprotect_code_phys(env, start, env->mem_io_vaddr);
+ }
+ }
+#else
if (!p->first_tb) {
invalidate_page_bitmap(p);
if (is_cpu_write_access) {
@@ -1070,6 +1204,7 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
}
}
#endif
+#endif
#ifdef TARGET_HAS_PRECISE_SMC
if (current_tb_modified) {
/* we generate a block containing just the instruction
@@ -1098,9 +1233,20 @@ static inline void tb_invalidate_phys_page_fast(tb_page_addr_t start, int len)
p = page_find(start >> TARGET_PAGE_BITS);
if (!p)
return;
+#ifdef CONFIG_COREMU
+#if defined(TARGET_I386)
+ int cpuid = cpu_single_env->cpuid_apic_id;
+#elif defined(TARGET_ARM)
+ int cpuid = cpu_single_env->cpu_index;
+#endif
+ if (p->cpu_tbs[cpuid].code_bitmap) {
+ offset = start & ~TARGET_PAGE_MASK;
+ b = p->cpu_tbs[cpuid].code_bitmap[offset >> 3] >> (offset & 7);
+#else
if (p->code_bitmap) {
offset = start & ~TARGET_PAGE_MASK;
b = p->code_bitmap[offset >> 3] >> (offset & 7);
+#endif
if (b & ((1 << len) - 1))
goto do_invalidate;
} else {
@@ -1179,9 +1325,21 @@ static inline void tb_alloc_page(TranslationBlock *tb,
tb->page_addr[n] = page_addr;
p = page_find_alloc(page_addr >> TARGET_PAGE_BITS, 1);
+#ifdef CONFIG_COREMU
+ assert(cpu_single_env);
+#if defined(TARGET_I386)
+ int cpuid = cpu_single_env->cpuid_apic_id;
+#elif defined(TARGET_ARM)
+ int cpuid = cpu_single_env->cpu_index;
+#endif
+ tb->page_next[n] = p->cpu_tbs[cpuid].first_tb;
+ last_first_tb = p->cpu_tbs[cpuid].first_tb;
+ p->cpu_tbs[cpuid].first_tb = (TranslationBlock *)((long)tb | n);
+#else
tb->page_next[n] = p->first_tb;
last_first_tb = p->first_tb;
p->first_tb = (TranslationBlock *)((long)tb | n);
+#endif
invalidate_page_bitmap(p);
#if defined(TARGET_HAS_SMC) || 1
@@ -1217,7 +1375,13 @@ static inline void tb_alloc_page(TranslationBlock *tb,
protected. So we handle the case where only the first TB is
allocated in a physical page */
if (!last_first_tb) {
+#ifdef CONFIG_COREMU
+ cm_phys_add_tb(page_addr);
+ if(cm_phys_page_tb_p(page_addr) == 1)
+ tlb_protect_code(page_addr);
+#else
tlb_protect_code(page_addr);
+#endif
}
#endif
@@ -1390,7 +1554,11 @@ static void breakpoint_invalidate(CPUState *env, target_ulong pc)
pd = p->phys_offset;
}
ram_addr = (pd & TARGET_PAGE_MASK) | (pc & ~TARGET_PAGE_MASK);
+#ifdef CONFIG_COREMU
+ cm_invalidate_tb(ram_addr, 1);
+#else
tb_invalidate_phys_page_range(ram_addr, ram_addr + 1, 0);
+#endif
}
#endif
#endif /* TARGET_HAS_ICE */
@@ -1612,7 +1780,7 @@ static void cpu_unlink_tb(CPUState *env)
emulation this often isn't actually as bad as it sounds. Often
signals are used primarily to interrupt blocking syscalls. */
TranslationBlock *tb;
- static spinlock_t interrupt_lock = SPIN_LOCK_UNLOCKED;
+ static COREMU_THREAD spinlock_t interrupt_lock = SPIN_LOCK_UNLOCKED;
spin_lock(&interrupt_lock);
tb = env->current_tb;
@@ -1936,7 +2104,14 @@ void tlb_flush(CPUState *env, int flush_global)
for(i = 0; i < CPU_TLB_SIZE; i++) {
int mmu_idx;
for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
+#ifdef CONFIG_COREMU
+ /* XXX: temporary solution to the tlb lookup data race problem */
+ env->tlb_table[mmu_idx][i].addr_read = -1;
+ env->tlb_table[mmu_idx][i].addr_write = -1;
+ env->tlb_table[mmu_idx][i].addr_code = -1;
+#else
env->tlb_table[mmu_idx][i] = s_cputlb_empty_entry;
+#endif
}
}
@@ -2048,8 +2223,13 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
int mmu_idx;
for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
for(i = 0; i < CPU_TLB_SIZE; i++)
+#ifdef CONFIG_COREMU
+ cm_tlb_reset_dirty_range(&env->tlb_table[mmu_idx][i],
+ start1, length);
+#else
tlb_reset_dirty_range(&env->tlb_table[mmu_idx][i],
start1, length);
+#endif
}
}
}
@@ -2639,7 +2819,17 @@ void cpu_register_physical_memory_offset(target_phys_addr_t start_addr,
reset the modified entries */
/* XXX: slow ! */
for(env = first_cpu; env != NULL; env = env->next_cpu) {
+ /* If there is no hot-plug device, this function won't be invoked
+ after the pci bus is initialized, so we don't enable broadcast
+ tlb flush in the common case. */
+#if defined(CONFIG_COREMU) && defined(COREMU_FLUSH_TLB)
+ if(coremu_init_done_p())
+ cm_send_tlb_flush_req(env->cpuid_apic_id);
+ else
+ tlb_flush(env, 1);
+#else
tlb_flush(env, 1);
+#endif
}
}
@@ -2805,6 +2995,10 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
memset(phys_ram_dirty + (last_ram_offset >> TARGET_PAGE_BITS),
0xff, size >> TARGET_PAGE_BITS);
+#ifdef CONFIG_COREMU
+ coremu_assert_hw_thr("qemu_ram_alloc should only called by hw thr");
+ cm_init_tb_cnt(last_ram_offset, size);
+#endif
last_ram_offset += size;
if (kvm_enabled())
@@ -2847,11 +3041,16 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
abort();
}
/* Move this entry to to start of the list. */
+#ifndef CONFIG_COREMU
+ /* Different cores can access this function at the same time.
+ * For coremu, disable this optimization to avoid data races.
+ * XXX or use a spin lock here if the performance impact is big. */
if (prev) {
prev->next = block->next;
block->next = *prevp;
*prevp = block;
}
+#endif
return block->host + (addr - block->offset);
}
@@ -2956,7 +3155,11 @@ static void notdirty_mem_writeb(void *opaque, target_phys_addr_t ram_addr,
dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
if (!(dirty_flags & CODE_DIRTY_FLAG)) {
#if !defined(CONFIG_USER_ONLY)
+#ifdef CONFIG_COREMU
+ cm_invalidate_tb(ram_addr, 1);
+#else
tb_invalidate_phys_page_fast(ram_addr, 1);
+#endif
dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
#endif
}
@@ -2976,7 +3179,11 @@ static void notdirty_mem_writew(void *opaque, target_phys_addr_t ram_addr,
dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
if (!(dirty_flags & CODE_DIRTY_FLAG)) {
#if !defined(CONFIG_USER_ONLY)
+#ifdef CONFIG_COREMU
+ cm_invalidate_tb(ram_addr, 2);
+#else
tb_invalidate_phys_page_fast(ram_addr, 2);
+#endif
dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
#endif
}
@@ -2996,7 +3203,11 @@ static void notdirty_mem_writel(void *opaque, target_phys_addr_t ram_addr,
dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
if (!(dirty_flags & CODE_DIRTY_FLAG)) {
#if !defined(CONFIG_USER_ONLY)
+#ifdef CONFIG_COREMU
+ cm_invalidate_tb(ram_addr, 4);
+#else
tb_invalidate_phys_page_fast(ram_addr, 4);
+#endif
dirty_flags = cpu_physical_memory_get_dirty_flags(ram_addr);
#endif
}
@@ -3419,7 +3630,11 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
memcpy(ptr, buf, l);
if (!cpu_physical_memory_is_dirty(addr1)) {
/* invalidate code */
+#ifdef CONFIG_COREMU
+ cm_invalidate_tb(addr1, l);
+#else
tb_invalidate_phys_page_range(addr1, addr1 + l, 0);
+#endif
/* set dirty bit */
cpu_physical_memory_set_dirty_flags(
addr1, (0xff & ~CODE_DIRTY_FLAG));
@@ -3626,7 +3841,11 @@ void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
l = access_len;
if (!cpu_physical_memory_is_dirty(addr1)) {
/* invalidate code */
+#ifdef CONFIG_COREMU
+ cm_invalidate_tb(addr1, l);
+#else
tb_invalidate_phys_page_range(addr1, addr1 + l, 0);
+#endif
/* set dirty bit */
cpu_physical_memory_set_dirty_flags(
addr1, (0xff & ~CODE_DIRTY_FLAG));
@@ -3785,7 +4004,11 @@ void stl_phys_notdirty(target_phys_addr_t addr, uint32_t val)
if (unlikely(in_migration)) {
if (!cpu_physical_memory_is_dirty(addr1)) {
/* invalidate code */
+ #ifdef CONFIG_COREMU
+ cm_invalidate_tb(addr1, 4);
+ #else
tb_invalidate_phys_page_range(addr1, addr1 + 4, 0);
+ #endif
/* set dirty bit */
cpu_physical_memory_set_dirty_flags(
addr1, (0xff & ~CODE_DIRTY_FLAG));
@@ -3854,7 +4077,11 @@ void stl_phys(target_phys_addr_t addr, uint32_t val)
stl_p(ptr, val);
if (!cpu_physical_memory_is_dirty(addr1)) {
/* invalidate code */
+ #ifdef CONFIG_COREMU
+ cm_invalidate_tb(addr1, 4);
+ #else
tb_invalidate_phys_page_range(addr1, addr1 + 4, 0);
+ #endif
/* set dirty bit */
cpu_physical_memory_set_dirty_flags(addr1,
(0xff & ~CODE_DIRTY_FLAG));
@@ -4076,3 +4303,8 @@ void dump_exec_info(FILE *f,
#undef env
#endif
+
+#ifdef CONFIG_COREMU
+#include "cm-init.c"
+#include "cm-tbinval.c"
+#endif
diff --git a/hw/apic.c b/hw/apic.c
old mode 100644
new mode 100755
index 9029dad..3d64331
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -26,6 +26,9 @@
#include "kvm.h"
//#define DEBUG_APIC
+#include "coremu-config.h"
+#include "cm-target-intr.h"
+#include "cm-timer.h"
/* APIC Local Vector Table */
#define APIC_LVT_TIMER 0
@@ -244,7 +247,11 @@ static void apic_bus_deliver(const uint32_t *deliver_bitmask,
if (d >= 0) {
apic_iter = local_apics[d];
if (apic_iter) {
+#ifdef CONFIG_COREMU
+ cm_send_apicbus_intr(apic_iter->id, CPU_INTERRUPT_HARD, vector_num, trigger_mode);
+#else
apic_set_irq(apic_iter, vector_num, trigger_mode);
+#endif
}
}
}
@@ -254,19 +261,37 @@ static void apic_bus_deliver(const uint32_t *deliver_bitmask,
break;
case APIC_DM_SMI:
+#ifdef CONFIG_COREMU
+ /* Vector number is -1, which indicates the vector is ignored */
+ foreach_apic(apic_iter, deliver_bitmask,
+ cm_send_apicbus_intr(apic_iter->id, CPU_INTERRUPT_SMI, -1, -1) );
+#else
foreach_apic(apic_iter, deliver_bitmask,
cpu_interrupt(apic_iter->cpu_env, CPU_INTERRUPT_SMI) );
+#endif
return;
case APIC_DM_NMI:
+#ifdef CONFIG_COREMU
+ /* Vector number is -1, which indicates the vector is ignored */
+ foreach_apic(apic_iter, deliver_bitmask,
+ cm_send_apicbus_intr(apic_iter->id, CPU_INTERRUPT_NMI, -1, -1) );
+#else
foreach_apic(apic_iter, deliver_bitmask,
cpu_interrupt(apic_iter->cpu_env, CPU_INTERRUPT_NMI) );
+#endif
return;
case APIC_DM_INIT:
/* normal INIT IPI sent to processors */
+#ifdef CONFIG_COREMU
+ /* Vector number is -1, which indicates the vector is ignored */
+ foreach_apic(apic_iter, deliver_bitmask,
+ cm_send_apicbus_intr(apic_iter->id, CPU_INTERRUPT_INIT, -1, -1) );
+#else
foreach_apic(apic_iter, deliver_bitmask,
cpu_interrupt(apic_iter->cpu_env, CPU_INTERRUPT_INIT) );
+#endif
return;
case APIC_DM_EXTINT:
@@ -277,8 +302,14 @@ static void apic_bus_deliver(const uint32_t *deliver_bitmask,
return;
}
+#ifdef CONFIG_COREMU
+ /* Deliver the interrupt with the given vector to each target APIC */
+ foreach_apic(apic_iter, deliver_bitmask,
+ cm_send_apicbus_intr(apic_iter->id, CPU_INTERRUPT_HARD, vector_num, trigger_mode) );
+#else
foreach_apic(apic_iter, deliver_bitmask,
apic_set_irq(apic_iter, vector_num, trigger_mode) );
+#endif
}
void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
@@ -553,16 +584,26 @@ static void apic_deliver(APICState *s, uint8_t dest, uint8_t dest_mode,
int trig_mode = (s->icr[0] >> 15) & 1;
int level = (s->icr[0] >> 14) & 1;
if (level == 0 && trig_mode == 1) {
+#ifdef CONFIG_COREMU
+ foreach_apic(apic_iter, deliver_bitmask,
+ cm_send_ipi_intr(apic_iter->id, vector_num, 0));
+#else
foreach_apic(apic_iter, deliver_bitmask,
apic_iter->arb_id = apic_iter->id );
+#endif
return;
}
}
break;
case APIC_DM_SIPI:
+#ifdef CONFIG_COREMU
+ foreach_apic(apic_iter, deliver_bitmask,
+ cm_send_ipi_intr(apic_iter->id, vector_num, 1));
+#else
foreach_apic(apic_iter, deliver_bitmask,
apic_startup(apic_iter, vector_num) );
+#endif
return;
}
@@ -646,11 +687,19 @@ static void apic_timer_update(APICState *s, int64_t current_time)
d = (uint64_t)s->initial_count + 1;
}
next_time = s->initial_count_load_time + (d << s->count_shift);
+#ifdef CONFIG_COREMU
+ cm_mod_local_timer(s->timer, next_time);
+#else
qemu_mod_timer(s->timer, next_time);
+#endif
s->next_time = next_time;
} else {
no_timer:
+#ifdef CONFIG_COREMU
+ cm_del_local_timer(s->timer);
+#else
qemu_del_timer(s->timer);
+#endif
}
}
@@ -1000,3 +1049,26 @@ int apic_init(CPUState *env)
local_apics[s->idx] = s;
return 0;
}
+
+/*
+ * COREMU Parallel Emulator Framework
+ * Wrappers for the COREMU I/O emulation mechanism.
+ *
+ * Copyright (C) 2010 PPI, Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ */
+/* Wrapper functions exposing the local APIC operations to COREMU. */
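+/* These wrappers are called from the COREMU per-core interrupt handlers
+ * (cm_apicbus_intr_handler and cm_ipi_intr_handler in
+ * target-i386/cm-target-intr.c), which run in the receiving core's thread. */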
+void cm_apic_set_irq(struct APICState *s, int vector_num, int trigger_mode)
+{
+ apic_set_irq(s, vector_num, trigger_mode);
+}
+
+void cm_apic_startup(struct APICState *s, int vector_num)
+{
+ apic_startup(s, vector_num);
+}
+
+void cm_apic_setup_arbid(struct APICState *s)
+{
+ s->arb_id = s->id;
+}
diff --git a/hw/arm_gic.c b/hw/arm_gic.c
old mode 100644
new mode 100755
index c4afc6a..1594213
--- a/hw/arm_gic.c
+++ b/hw/arm_gic.c
@@ -12,6 +12,10 @@
Nested Vectored Interrupt Controller. */
//#define DEBUG_GIC
+#include "coremu-config.h"
+#include "coremu-spinlock.h"
+#include "cm-target-intr.h"
+#include "coremu-hw.h"
#ifdef DEBUG_GIC
#define DPRINTF(fmt, ...) \
@@ -151,24 +155,46 @@ static void __attribute__((unused))
gic_set_pending_private(gic_state *s, int cpu, int irq)
{
int cm = 1 << cpu;
-
+#ifdef CONFIG_COREMU
+ if(coremu_hw_thr_p())
+ coremu_spin_lock(&cm_hw_lock);
+#endif
if (GIC_TEST_PENDING(irq, cm))
+ {
+#ifdef CONFIG_COREMU
+ if(coremu_hw_thr_p())
+ coremu_spin_unlock(&cm_hw_lock);
+#endif
return;
-
+ }
DPRINTF("Set %d pending cpu %d\n", irq, cpu);
GIC_SET_PENDING(irq, cm);
gic_update(s);
+#ifdef CONFIG_COREMU
+ if(coremu_hw_thr_p())
+ coremu_spin_unlock(&cm_hw_lock);
+#endif
}
/* Process a change in an external IRQ input. */
static void gic_set_irq(void *opaque, int irq, int level)
{
+#ifdef CONFIG_COREMU
+ if(coremu_hw_thr_p())
+ coremu_spin_lock(&cm_hw_lock);
+#endif
gic_state *s = (gic_state *)opaque;
/* The first external input line is internal interrupt 32. */
irq += 32;
- if (level == GIC_TEST_LEVEL(irq, ALL_CPU_MASK))
+ if (level == GIC_TEST_LEVEL(irq, ALL_CPU_MASK)) {
+#ifdef CONFIG_COREMU
+ if(coremu_hw_thr_p())
+ coremu_spin_unlock(&cm_hw_lock);
+#endif
return;
+ }
+
if (level) {
GIC_SET_LEVEL(irq, ALL_CPU_MASK);
if (GIC_TEST_TRIGGER(irq) || GIC_TEST_ENABLED(irq)) {
@@ -179,6 +205,10 @@ static void gic_set_irq(void *opaque, int irq, int level)
GIC_CLEAR_LEVEL(irq, ALL_CPU_MASK);
}
gic_update(s);
+#ifdef CONFIG_COREMU
+ if(coremu_hw_thr_p())
+ coremu_spin_unlock(&cm_hw_lock);
+#endif
}
static void gic_set_running_irq(gic_state *s, int cpu, int irq)
@@ -194,6 +224,10 @@ static void gic_set_running_irq(gic_state *s, int cpu, int irq)
static uint32_t gic_acknowledge_irq(gic_state *s, int cpu)
{
+#ifdef CONFIG_COREMU
+ if(coremu_hw_thr_p())
+ coremu_spin_lock(&cm_hw_lock);
+#endif
int new_irq;
int cm = 1 << cpu;
new_irq = s->current_pending[cpu];
@@ -208,11 +242,19 @@ static uint32_t gic_acknowledge_irq(gic_state *s, int cpu)
GIC_CLEAR_PENDING(new_irq, GIC_TEST_MODEL(new_irq) ? ALL_CPU_MASK : cm);
gic_set_running_irq(s, cpu, new_irq);
DPRINTF("ACK %d\n", new_irq);
+#ifdef CONFIG_COREMU
+ if(coremu_hw_thr_p())
+ coremu_spin_unlock(&cm_hw_lock);
+#endif
return new_irq;
}
static void gic_complete_irq(gic_state * s, int cpu, int irq)
{
+#ifdef CONFIG_COREMU
+ if(coremu_hw_thr_p())
+ coremu_spin_lock(&cm_hw_lock);
+#endif
int update = 0;
int cm = 1 << cpu;
DPRINTF("EOI %d\n", irq);
@@ -245,6 +287,10 @@ static void gic_complete_irq(gic_state * s, int cpu, int irq)
/* Complete the current running IRQ. */
gic_set_running_irq(s, cpu, s->last_active[s->running_irq[cpu]][cpu]);
}
+#ifdef CONFIG_COREMU
+ if(coremu_hw_thr_p())
+ coremu_spin_unlock(&cm_hw_lock);
+#endif
}
static uint32_t gic_dist_readb(void *opaque, target_phys_addr_t offset)
diff --git a/hw/arm_pic.c b/hw/arm_pic.c
old mode 100644
new mode 100755
index f44568c..3aa6016
--- a/hw/arm_pic.c
+++ b/hw/arm_pic.c
@@ -11,6 +11,9 @@
#include "pc.h"
#include "arm-misc.h"
+#include "coremu-config.h"
+#include "cm-target-intr.h"
+
/* Stub functions for hardware that doesn't exist. */
void pic_info(Monitor *mon)
{
@@ -45,5 +48,9 @@ static void arm_pic_cpu_handler(void *opaque, int irq, int level)
qemu_irq *arm_pic_init_cpu(CPUState *env)
{
+#ifdef CONFIG_COREMU
+ return qemu_allocate_irqs(cm_arm_pic_cpu_handler, env, 2);
+#else
return qemu_allocate_irqs(arm_pic_cpu_handler, env, 2);
+#endif
}
diff --git a/hw/i8259.c b/hw/i8259.c
old mode 100644
new mode 100755
index ea48e0e..3943ec7
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -32,6 +32,7 @@
//#define DEBUG_IRQ_LATENCY
//#define DEBUG_IRQ_COUNT
+#include "coremu-config.h"
typedef struct PicState {
uint8_t last_irr; /* edge detection */
@@ -245,7 +246,13 @@ int pic_read_irq(PicState2 *s)
irq = 7;
intno = s->pics[0].irq_base + irq;
}
+#ifndef CONFIG_COREMU
+ /* COREMU XXX: in parallel emulation, we always use real-time signals to
+ * inform the emulated cores about interrupts, so there is no need for the
+ * emulator to perform this update itself.
+ * ??? needs more checking ??? */
pic_update_irq(s);
+#endif
#ifdef DEBUG_IRQ_LATENCY
printf("IRQ%d latency=%0.3fus\n",
diff --git a/hw/ide/core.c b/hw/ide/core.c
old mode 100644
new mode 100755
index 0757528..ad46a3a
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -582,7 +582,21 @@ static void ide_read_dma_cb(void *opaque, int ret)
/* end of transfer ? */
if (s->nsector == 0) {
s->status = READY_STAT | SEEK_STAT;
+
+/* For COREMU, the DMA state needs to be changed before the IRQ is sent */
+#ifdef CONFIG_COREMU
+ bm->status &= ~BM_STATUS_DMAING;
+ bm->status |= BM_STATUS_INT;
+ bm->dma_cb = NULL;
+ bm->unit = -1;
+ bm->aiocb = NULL;
+#endif
ide_set_irq(s->bus);
+
+#ifdef CONFIG_COREMU
+ return;
+#endif
+
eot:
bm->status &= ~BM_STATUS_DMAING;
bm->status |= BM_STATUS_INT;
@@ -726,7 +740,21 @@ static void ide_write_dma_cb(void *opaque, int ret)
/* end of transfer ? */
if (s->nsector == 0) {
s->status = READY_STAT | SEEK_STAT;
+
+/* For COREMU, the DMA state needs to be changed before the IRQ is sent */
+#ifdef CONFIG_COREMU
+ bm->status &= ~BM_STATUS_DMAING;
+ bm->status |= BM_STATUS_INT;
+ bm->dma_cb = NULL;
+ bm->unit = -1;
+ bm->aiocb = NULL;
+#endif
ide_set_irq(s->bus);
+
+#ifdef CONFIG_COREMU
+ return;
+#endif
+
eot:
bm->status &= ~BM_STATUS_DMAING;
bm->status |= BM_STATUS_INT;
diff --git a/hw/ioapic.c b/hw/ioapic.c
index 7ad8018..0cbeac3 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -179,6 +179,14 @@ static void ioapic_mem_writel(void *opaque, target_phys_addr_t addr, uint32_t va
default:
index = (s->ioregsel - 0x10) >> 1;
if (index >= 0 && index < IOAPIC_NUM_PINS) {
+#ifdef CONFIG_COREMU
+ /* Qemu's original code can cause a data race: if ioapic_service
+ * reads the table entry between the two assignments, it may
+ * observe a zero entry.
+ * In fact, we only need to assign the high or low 32 bits of
+ * the table entry according to ioregsel. */
+ *((uint32_t *)(s->ioredtbl + index) + (s->ioregsel & 1)) = val;
+#else
if (s->ioregsel & 1) {
s->ioredtbl[index] &= 0xffffffff;
s->ioredtbl[index] |= (uint64_t)val << 32;
@@ -186,6 +194,7 @@ static void ioapic_mem_writel(void *opaque, target_phys_addr_t addr, uint32_t va
s->ioredtbl[index] &= ~0xffffffffULL;
s->ioredtbl[index] |= val;
}
+#endif
ioapic_service(s);
}
}
diff --git a/hw/pc.c b/hw/pc.c
old mode 100644
new mode 100755
index db2b9a2..1340916
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -50,6 +50,10 @@
/* output Bochs bios info messages */
//#define DEBUG_BIOS
+#include "coremu-config.h"
+#include "coremu-init.h"
+#include "coremu-core.h"
+#include "cm-target-intr.h"
#define BIOS_FILENAME "bios.bin"
@@ -141,7 +145,9 @@ int cpu_get_pic_interrupt(CPUState *env)
if (intno >= 0) {
/* set irq request if a PIC irq is still pending */
/* XXX: improve that */
+#ifndef CONFIG_COREMU
pic_update_irq(isa_pic);
+#endif
return intno;
}
/* read the irq from the PIC */
@@ -795,6 +801,9 @@ static CPUState *pc_new_cpu(const char *cpu_model)
} else {
qemu_register_reset((QEMUResetHandler*)cpu_reset, env);
}
+#ifdef CONFIG_COREMU
+ coremu_core_init(env->cpuid_apic_id, env);
+#endif
return env;
}
@@ -916,8 +925,11 @@ static void pc_init1(ram_addr_t ram_size,
for (i = 0; i < nb_option_roms; i++) {
rom_add_option(option_rom[i]);
}
-
+#ifdef CONFIG_COREMU
+ cpu_irq = qemu_allocate_irqs(cm_pic_irq_request, NULL, 1);
+#else
cpu_irq = qemu_allocate_irqs(pic_irq_request, NULL, 1);
+#endif
i8259 = i8259_init(cpu_irq[0]);
isa_irq_state = qemu_mallocz(sizeof(*isa_irq_state));
isa_irq_state->i8259 = i8259;
@@ -1217,3 +1229,37 @@ static void pc_machine_init(void)
}
machine_init(pc_machine_init);
+
+/*
+ * COREMU Parallel Emulator Framework
+ * Wrappers for the COREMU I/O emulation mechanism
+ *
+ * Copyright (C) 2010 PPI, Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ */
+#ifdef CONFIG_COREMU
+/* The pic irq request */
+void cm_pic_irq_request(void * opaque, int irq, int level)
+{
+ CPUState *env = NULL;
+
+ if (coremu_init_done_p()) {
+ /* Send the signal to core thread */
+ env = first_cpu;
+ if (env->apic_state) {
+ while (env) {
+ if (apic_accept_pic_intr(env)) {
+ cm_send_pic_intr(env->cpuid_apic_id, level);
+ }
+ env = env->next_cpu;
+ }
+ } else {
+ /* Uniprocessor system without lapic */
+ cm_send_pic_intr(env->cpuid_apic_id, level);
+ }
+ } else {
+ /* Initialization hasn't finished */
+ pic_irq_request(opaque, irq, level);
+ }
+}
+#endif
diff --git a/hw/pc.h b/hw/pc.h
index d11a576..5a954fb 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -3,6 +3,8 @@
#include "qemu-common.h"
#include "ioport.h"
+#include "coremu-config.h"
+#include "coremu-sched.h"
/* PC-style peripherals (also used by other machines). */
@@ -37,8 +39,16 @@ void pic_info(Monitor *mon);
void irq_info(Monitor *mon);
/* i8254.c */
-
+#ifdef CONFIG_COREMU
+extern int cm_pit_freq;
+/*
+ * For parallel emulation, the timer frequency needs to be reduced when
+ * more than one thread runs on a single physical core.
+ */
+#define PIT_FREQ cm_pit_freq
+#else
#define PIT_FREQ 1193182
+#endif
typedef struct PITState PITState;
diff --git a/posix-aio-compat.c b/posix-aio-compat.c
index b43c531..392cce5 100644
--- a/posix-aio-compat.c
+++ b/posix-aio-compat.c
@@ -29,6 +29,10 @@
#include "block/raw-posix-aio.h"
+#include "coremu-config.h"
+#include "coremu-hw.h"
+#include "coremu-thread.h"
+
struct qemu_paiocb {
BlockDriverAIOCB common;
@@ -302,10 +306,13 @@ static ssize_t handle_aiocb_rw(struct qemu_paiocb *aiocb)
static void *aio_thread(void *unused)
{
- pid_t pid;
+#ifdef CONFIG_COREMU
+ coremu_thread_setpriority(PRIO_PROCESS, 0, -21);
+#else
+ pid_t pid;
pid = getpid();
-
+#endif
while (1) {
struct qemu_paiocb *aiocb;
ssize_t ret = 0;
@@ -353,8 +360,11 @@ static void *aio_thread(void *unused)
aiocb->ret = ret;
idle_threads++;
mutex_unlock(&lock);
-
+#ifdef CONFIG_COREMU
+ coremu_signal_hw_thr(aiocb->ev_signo);
+#else
if (kill(pid, aiocb->ev_signo)) die("kill failed");
+#endif
}
idle_threads--;
@@ -499,6 +509,10 @@ static PosixAioState *posix_aio_state;
static void aio_signal_handler(int signum)
{
+#ifdef CONFIG_COREMU
+ coremu_assert_hw_thr("aio_signal_handler should only called by hw thr\n");
+#endif
+
if (posix_aio_state) {
char byte = 0;
ssize_t ret;
@@ -507,8 +521,9 @@ static void aio_signal_handler(int signum)
if (ret < 0 && errno != EAGAIN)
die("write()");
}
-
+#ifndef CONFIG_COREMU
qemu_service_io();
+#endif
}
static void paio_remove(struct qemu_paiocb *acb)
@@ -570,7 +585,11 @@ BlockDriverAIOCB *paio_submit(BlockDriverState *bs, int fd,
return NULL;
acb->aio_type = type;
acb->aio_fildes = fd;
+#ifdef CONFIG_COREMU
+ acb->ev_signo = COREMU_AIO_SIG;
+#else
acb->ev_signo = SIGUSR2;
+#endif
acb->async_context_id = get_async_context_id();
if (qiov) {
@@ -598,7 +617,11 @@ BlockDriverAIOCB *paio_ioctl(BlockDriverState *bs, int fd,
return NULL;
acb->aio_type = QEMU_AIO_IOCTL;
acb->aio_fildes = fd;
+#ifdef CONFIG_COREMU
+ acb->ev_signo = COREMU_AIO_SIG;
+#else
acb->ev_signo = SIGUSR2;
+#endif
acb->aio_offset = 0;
acb->aio_ioctl_buf = buf;
acb->aio_ioctl_cmd = req;
@@ -625,7 +648,11 @@ int paio_init(void)
sigfillset(&act.sa_mask);
act.sa_flags = 0; /* do not restart syscalls to interrupt select() */
act.sa_handler = aio_signal_handler;
+#ifdef CONFIG_COREMU
+ sigaction(COREMU_AIO_SIG, &act, NULL);
+#else
sigaction(SIGUSR2, &act, NULL);
+#endif
s->first_aio = NULL;
if (qemu_pipe(fds) == -1) {
diff --git a/qemu-timer.c b/qemu-timer.c
index bdc8206..fa50562 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -55,6 +55,14 @@
#include "qemu-timer.h"
+#include "coremu-config.h"
+#include "coremu-timer.h"
+#include "coremu-debug.h"
+#include "coremu-core.h"
+#include "coremu-hw.h"
+#include "cm-intr.h"
+#include "cm-timer.h"
+
/* Conversion factor from emulated instructions to virtual clock ticks. */
int icount_time_shift;
/* Arbitrarily pick 1MIPS as the minimum allowable speed. */
@@ -121,7 +129,7 @@ static void init_get_clock(void)
static int64_t get_clock(void)
{
#if defined(__linux__) || (defined(__FreeBSD__) && __FreeBSD_version >= 500000) \
- || defined(__DragonFly__) || defined(__FreeBSD_kernel__)
+ || defined(__DragonFly__) || defined(__FreeBSD_kernel__)
if (use_rt_clock) {
struct timespec ts;
clock_gettime(CLOCK_MONOTONIC, &ts);
@@ -147,7 +155,7 @@ typedef struct TimersState {
int64_t dummy;
} TimersState;
-TimersState timers_state;
+COREMU_THREAD TimersState timers_state;
/* return the host CPU cycle counter and handle stop/restart */
int64_t cpu_get_ticks(void)
@@ -160,12 +168,14 @@ int64_t cpu_get_ticks(void)
} else {
int64_t ticks;
ticks = cpu_get_real_ticks();
+#ifndef CONFIG_COREMU
if (timers_state.cpu_ticks_prev > ticks) {
/* Note: non increasing ticks may happen if the host uses
software suspend */
timers_state.cpu_ticks_offset += timers_state.cpu_ticks_prev - ticks;
}
timers_state.cpu_ticks_prev = ticks;
+#endif
return ticks + timers_state.cpu_ticks_offset;
}
}
@@ -423,7 +433,7 @@ void configure_alarms(char const *opt)
/* Ignore */
goto next;
- /* Swap */
+ /* Swap */
tmp = alarm_timers[i];
alarm_timers[i] = alarm_timers[cur];
alarm_timers[cur] = tmp;
@@ -718,9 +728,12 @@ static void CALLBACK host_alarm_handler(UINT uTimerID, UINT uMsg,
static void host_alarm_handler(int host_signum)
#endif
{
+ //printf("host_alarm_handler\n");
+ coremu_assert_hw_thr("Host_alarm_handler should be called by hw thr\n");
+
struct qemu_alarm_timer *t = alarm_timer;
if (!t)
- return;
+ return;
#if 0
#define DISP_FREQ 1000
@@ -926,9 +939,27 @@ static int dynticks_start_timer(struct qemu_alarm_timer *t)
act.sa_flags = 0;
act.sa_handler = host_alarm_handler;
+#ifdef CONFIG_COREMU
+ int signo;
+ (void) ev;
+
+ if (coremu_hw_thr_p()) {
+ signo = COREMU_HARDWARE_ALARM;
+ sigaction(COREMU_HARDWARE_ALARM, &act, NULL);
+ } else {
+ /* The core signal handler is registered before running the cores. */
+ signo = COREMU_CORE_ALARM;
+ }
+
+ if (coremu_timer_create(signo, &host_timer)) {
+ perror("timer_create");
+ cm_assert(0, "timer create failed");
+ return -1;
+ }
+#else
sigaction(SIGALRM, &act, NULL);
- /*
+ /*
* Initialize ev struct to 0 to avoid valgrind complaining
* about uninitialized data in timer_create call
*/
@@ -945,7 +976,7 @@ static int dynticks_start_timer(struct qemu_alarm_timer *t)
return -1;
}
-
+#endif
t->priv = (void *)(long)host_timer;
return 0;
@@ -1160,7 +1191,7 @@ int qemu_calculate_timeout(void)
int64_t add;
int64_t delta;
/* Advance virtual time to the next event. */
- delta = qemu_icount_delta();
+ delta = qemu_icount_delta();
if (delta > 0) {
/* If virtual time is ahead of real time then just
wait for IO. */
@@ -1188,3 +1219,6 @@ int qemu_calculate_timeout(void)
#endif
}
+#ifdef CONFIG_COREMU
+#include "cm-timer.c"
+#endif
diff --git a/qemu-timer.h b/qemu-timer.h
index 1494f79..288cb27 100644
--- a/qemu-timer.h
+++ b/qemu-timer.h
@@ -2,7 +2,6 @@
#define QEMU_TIMER_H
#include "qemu-common.h"
-
/* timers */
typedef struct QEMUClock QEMUClock;
diff --git a/softmmu_template.h b/softmmu_template.h
index c2df9ec..6289480 100644
--- a/softmmu_template.h
+++ b/softmmu_template.h
@@ -18,6 +18,11 @@
*/
#include "qemu-timer.h"
+#if defined(TARGET_ARM)
+#include "coremu-spinlock.h"
+#include "cm-target-intr.h"
+#endif
+
#define DATA_SIZE (1 << SHIFT)
#if DATA_SIZE == 8
@@ -55,6 +60,9 @@ static inline DATA_TYPE glue(io_read, SUFFIX)(target_phys_addr_t physaddr,
target_ulong addr,
void *retaddr)
{
+#if defined(CONFIG_COREMU) && defined(TARGET_ARM)
+ coremu_spin_lock(&cm_hw_lock);
+#endif
DATA_TYPE res;
int index;
index = (physaddr >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1);
@@ -77,6 +85,10 @@ static inline DATA_TYPE glue(io_read, SUFFIX)(target_phys_addr_t physaddr,
res |= (uint64_t)io_mem_read[index][2](io_mem_opaque[index], physaddr + 4) << 32;
#endif
#endif /* SHIFT > 2 */
+
+#if defined(CONFIG_COREMU) && defined(TARGET_ARM)
+ coremu_spin_unlock(&cm_hw_lock);
+#endif
return res;
}
@@ -199,6 +211,9 @@ static inline void glue(io_write, SUFFIX)(target_phys_addr_t physaddr,
target_ulong addr,
void *retaddr)
{
+#if defined(CONFIG_COREMU) && defined(TARGET_ARM)
+ coremu_spin_lock(&cm_hw_lock);
+#endif
int index;
index = (physaddr >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1);
physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
@@ -220,6 +235,10 @@ static inline void glue(io_write, SUFFIX)(target_phys_addr_t physaddr,
io_mem_write[index][2](io_mem_opaque[index], physaddr + 4, val >> 32);
#endif
#endif /* SHIFT > 2 */
+
+#if defined(CONFIG_COREMU) && defined(TARGET_ARM)
+ coremu_spin_unlock(&cm_hw_lock);
+#endif
}
void REGPARM glue(glue(__st, SUFFIX), MMUSUFFIX)(target_ulong addr,
diff --git a/target-arm/cm-atomic.c b/target-arm/cm-atomic.c
new file mode 100644
index 0000000..9d57243
--- /dev/null
+++ b/target-arm/cm-atomic.c
@@ -0,0 +1,211 @@
+/*
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/* We include this file in op_helper.c */
+
+#include <stdlib.h>
+#include <pthread.h>
+#include "coremu-atomic.h"
+#include "coremu-sched.h"
+#include "coremu-types.h"
+
+/* These definitions are copied from translate.c */
+#if defined(WORDS_BIGENDIAN)
+#define REG_B_OFFSET (sizeof(target_ulong) - 1)
+#define REG_H_OFFSET (sizeof(target_ulong) - 2)
+#define REG_W_OFFSET (sizeof(target_ulong) - 2)
+#define REG_L_OFFSET (sizeof(target_ulong) - 4)
+#define REG_LH_OFFSET (sizeof(target_ulong) - 8)
+#else
+#define REG_B_OFFSET 0
+#define REG_H_OFFSET 1
+#define REG_W_OFFSET 0
+#define REG_L_OFFSET 0
+#define REG_LH_OFFSET 4
+#endif
+
+#define REG_LOW_MASK (~(uint64_t)0x0>>32)
+
+/* gen_op instructions */
+/* i386 arith/logic operations */
+enum {
+ OP_ADDL,
+ OP_ORL,
+ OP_ADCL,
+ OP_SBBL,
+ OP_ANDL,
+ OP_SUBL,
+ OP_XORL,
+ OP_CMPL,
+};
+
+/* XXX: This is not platform specific; move it elsewhere later. */
+
+/* Given the guest virtual address, get the corresponding host address.
+ * This macro resembles ldxxx in softmmu_template.h
+ * NOTE: This must be inlined since the use of GETPC needs to get the
+ * return address. Using always_inline also works; we use a macro here to be
+ * more explicit. */
+#define CM_GET_QEMU_ADDR(q_addr, v_addr) \
+do { \
+ int __mmu_idx, __index; \
+ CPUState *__env1 = cpu_single_env; \
+ void *__retaddr; \
+ __index = (v_addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1); \
+ /* get the CPL, hence determine the MMU mode */ \
+ __mmu_idx = cpu_mmu_index(__env1); \
+ /* We use this function in the implementation of atomic instructions */ \
+ /* and we are going to modify these memory. So we use addr_write. */ \
+ if (unlikely(__env1->tlb_table[__mmu_idx][__index].addr_write \
+ != (v_addr & TARGET_PAGE_MASK))) { \
+ __retaddr = GETPC(); \
+ tlb_fill(v_addr, 1, __mmu_idx, __retaddr); \
+ } \
+ q_addr = v_addr + __env1->tlb_table[__mmu_idx][__index].addend; \
+} while(0)
+
+#define LD_b ldub_raw
+#define LD_w lduw_raw
+#define LD_l ldl_raw
+#define LD_q ldq_raw
+
+/* Lightweight transactional memory. */
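+/* TX reads the old value at vaddr, runs `command' to compute the new value,
+ * and retries until the compare-and-swap confirms that no other core has
+ * modified the location in between. */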
+#define TX(vaddr, type, value, command) \
+ target_ulong __q_addr; \
+ DATA_##type __oldv; \
+ DATA_##type value; \
+ \
+ CM_GET_QEMU_ADDR(__q_addr, vaddr); \
+ do { \
+ __oldv = value = LD_##type((DATA_##type *)__q_addr); \
+ {command;}; \
+ mb(); \
+ } while (__oldv != (atomic_compare_exchange##type( \
+ (DATA_##type *)__q_addr, __oldv, value)))
+
+COREMU_THREAD uint64_t cm_exclusive_val;
+COREMU_THREAD uint32_t cm_exclusive_addr = -1;
+
+#define GEN_LOAD_EXCLUSIVE(type, TYPE) \
+void HELPER(load_exclusive##type)(uint32_t reg, uint32_t addr) \
+{ \
+ ram_addr_t q_addr = 0; \
+ DATA_##type val = 0; \
+ \
+ cm_exclusive_addr = addr; \
+ CM_GET_QEMU_ADDR(q_addr,addr); \
+ val = *(DATA_##type *)q_addr; \
+ cm_exclusive_val = val; \
+ cpu_single_env->regs[reg] = val; \
+}
+
+GEN_LOAD_EXCLUSIVE(b, B);
+GEN_LOAD_EXCLUSIVE(w, W);
+GEN_LOAD_EXCLUSIVE(l, L);
+//GEN_LOAD_EXCLUSIVE(q, Q);
+
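+/* Store-exclusive is emulated with compare-and-swap: the store succeeds only
+ * if the address matches the pending load-exclusive and the location still
+ * holds the value observed by it; the result register is set to 0 on success
+ * and 1 on failure. */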
+#define GEN_STORE_EXCLUSIVE(type, TYPE) \
+void HELPER(store_exclusive##type)(uint32_t res, uint32_t reg, uint32_t addr) \
+{ \
+ ram_addr_t q_addr = 0; \
+ DATA_##type val = 0; \
+ DATA_##type r = 0; \
+ \
+ if(addr != cm_exclusive_addr) \
+ goto fail; \
+ \
+ CM_GET_QEMU_ADDR(q_addr,addr); \
+ val = (DATA_##type)cpu_single_env->regs[reg]; \
+ \
+ r = atomic_compare_exchange##type((DATA_##type *)q_addr, \
+ (DATA_##type)cm_exclusive_val, val); \
+ \
+ if(r == (DATA_##type)cm_exclusive_val) { \
+ cpu_single_env->regs[res] = 0; \
+ goto done; \
+ } else { \
+ goto fail; \
+ } \
+ \
+fail: \
+ cpu_single_env->regs[res] = 1; \
+ \
+done: \
+ cm_exclusive_addr = -1; \
+ return; \
+}
+
+GEN_STORE_EXCLUSIVE(b, B);
+GEN_STORE_EXCLUSIVE(w, W);
+GEN_STORE_EXCLUSIVE(l, L);
+//GEN_STORE_EXCLUSIVE(q, Q);
+
+void HELPER(load_exclusiveq)(uint32_t reg, uint32_t addr)
+{
+ ram_addr_t q_addr = 0;
+ uint64_t val = 0;
+
+ cm_exclusive_addr = addr;
+ CM_GET_QEMU_ADDR(q_addr,addr);
+ val = *(uint64_t *)q_addr;
+ cm_exclusive_val = val;
+ cpu_single_env->regs[reg] = (uint32_t)val;
+ cpu_single_env->regs[reg + 1] = (uint32_t)(val>>32);
+}
+
+void HELPER(store_exclusiveq)(uint32_t res, uint32_t reg, uint32_t addr)
+{
+ ram_addr_t q_addr = 0;
+ uint64_t val = 0;
+ uint64_t r = 0;
+
+ if(addr != cm_exclusive_addr)
+ goto fail;
+
+ CM_GET_QEMU_ADDR(q_addr,addr);
+ val = (uint32_t)cpu_single_env->regs[reg];
+ val |= ((uint64_t)cpu_single_env->regs[reg + 1]) << 32;
+
+ r = atomic_compare_exchangeq((uint64_t *)q_addr,
+ (uint64_t)cm_exclusive_val, val);
+
+ if(r == (uint64_t)cm_exclusive_val) {
+ cpu_single_env->regs[res] = 0;
+ goto done;
+ } else {
+ goto fail;
+ }
+
+fail:
+ cpu_single_env->regs[res] = 1;
+
+done:
+ cm_exclusive_addr = -1;
+ return;
+}
+
+void HELPER(clear_exclusive)()
+{
+ cm_exclusive_addr = -1;
+}
diff --git a/target-arm/cm-atomic.h b/target-arm/cm-atomic.h
new file mode 100644
index 0000000..26ee256
--- /dev/null
+++ b/target-arm/cm-atomic.h
@@ -0,0 +1,34 @@
+/*
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define __GEN_HEADER(type) \
+DEF_HELPER_2(load_exclusive##type, void, i32, i32) \
+DEF_HELPER_3(store_exclusive##type, void, i32, i32, i32)
+
+__GEN_HEADER(b)
+__GEN_HEADER(w)
+__GEN_HEADER(l)
+__GEN_HEADER(q)
+
+DEF_HELPER_0(clear_exclusive, void)
+
diff --git a/target-arm/cm-target-intr.c b/target-arm/cm-target-intr.c
new file mode 100644
index 0000000..1627d28
--- /dev/null
+++ b/target-arm/cm-target-intr.c
@@ -0,0 +1,74 @@
+/*
+ * COREMU Parallel Emulator Framework
+ * The definition of the interrupt related interface for ARM
+ *
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include "cpu.h"
+#include "../hw/arm-misc.h"
+#include "coremu-intr.h"
+#include "coremu-malloc.h"
+#include "coremu-atomic.h"
+#include "coremu-spinlock.h"
+#include "cm-intr.h"
+#include "cm-target-intr.h"
+
+CMSpinLock cm_hw_lock;
+static void cm_gic_intr_handler(void *opaque)
+{
+ CMGICIntr *gic_intr = (CMGICIntr *) opaque;
+ switch (gic_intr->irq_num) {
+ case ARM_PIC_CPU_IRQ:
+ if (gic_intr->level)
+ cpu_interrupt(cpu_single_env, CPU_INTERRUPT_HARD);
+ else
+ cpu_reset_interrupt(cpu_single_env, CPU_INTERRUPT_HARD);
+ break;
+ case ARM_PIC_CPU_FIQ:
+ if (gic_intr->level)
+ cpu_interrupt(cpu_single_env, CPU_INTERRUPT_FIQ);
+ else
+ cpu_reset_interrupt(cpu_single_env, CPU_INTERRUPT_FIQ);
+ break;
+ default:
+ hw_error("arm_pic_cpu_handler: Bad interrput line %d\n",
+ gic_intr->irq_num);
+ }
+
+}
+
+static CMIntr *cm_gic_intr_init(int irq, int level)
+{
+ CMGICIntr *intr = coremu_mallocz(sizeof(*intr));
+ ((CMIntr *)intr)->handler = cm_gic_intr_handler;
+ intr->irq_num = irq;
+ intr->level = level;
+ return (CMIntr *)intr;
+}
+
+void cm_arm_pic_cpu_handler(void *opaque, int irq, int level)
+{
+ CPUState *env = (CPUState *)opaque;
+ coremu_send_intr(cm_gic_intr_init(irq, level), env->cpu_index);
+}
diff --git a/target-arm/cm-target-intr.h b/target-arm/cm-target-intr.h
new file mode 100644
index 0000000..7a2a148
--- /dev/null
+++ b/target-arm/cm-target-intr.h
@@ -0,0 +1,40 @@
+/*
+ * COREMU Parallel Emulator Framework
+ * The definition of the interrupt related interface for ARM
+ *
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef CM_ARM_INTR_H
+#define CM_ARM_INTR_H
+#include "cm-intr.h"
+#include "coremu-spinlock.h"
+
+typedef struct CMGICIntr {
+ CMIntr *base;
+ int irq_num;
+ int level;
+} CMGICIntr;
+
+extern CMSpinLock cm_hw_lock;
+void cm_arm_pic_cpu_handler(void *opaque, int irq, int level);
+
+#endif
diff --git a/target-arm/helper.c b/target-arm/helper.c
old mode 100644
new mode 100755
index 99e0394..efbb2fa
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -295,9 +295,12 @@ CPUARMState *cpu_arm_init(const char *cpu_model)
return NULL;
env = qemu_mallocz(sizeof(CPUARMState));
cpu_exec_init(env);
+
if (!inited) {
inited = 1;
+#ifndef CONFIG_COREMU
arm_translate_init();
+#endif
}
env->cpu_model_str = cpu_model;
@@ -314,6 +317,10 @@ CPUARMState *cpu_arm_init(const char *cpu_model)
19, "arm-vfp.xml", 0);
}
qemu_init_vcpu(env);
+
+#ifdef CONFIG_COREMU
+ coremu_core_init(env->cpu_index, env);
+#endif
return env;
}
diff --git a/target-arm/helpers.h b/target-arm/helpers.h
index 0d1bc47..9e07338 100644
--- a/target-arm/helpers.h
+++ b/target-arm/helpers.h
@@ -447,4 +447,9 @@ DEF_HELPER_3(iwmmxt_muladdswl, i64, i64, i32, i32)
DEF_HELPER_2(set_teecr, void, env, i32)
+#include "coremu-config.h"
+#ifdef CONFIG_COREMU
+#include "cm-atomic.h"
+#endif
+
#include "def-helper.h"
diff --git a/target-arm/neon_helper.c b/target-arm/neon_helper.c
index 5e6452b..e58a8cd 100644
--- a/target-arm/neon_helper.c
+++ b/target-arm/neon_helper.c
@@ -18,7 +18,7 @@
#define SET_QC() env->vfp.xregs[ARM_VFP_FPSCR] = CPSR_Q
-static float_status neon_float_status;
+static COREMU_THREAD float_status neon_float_status;
#define NFS &neon_float_status
/* Helper routines to perform bitwise copies between float and int. */
diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
index 9b1a014..3503855 100644
--- a/target-arm/op_helper.c
+++ b/target-arm/op_helper.c
@@ -487,3 +487,8 @@ uint64_t HELPER(neon_sub_saturate_u64)(uint64_t src1, uint64_t src2)
}
return res;
}
+
+#include "coremu-config.h"
+#ifdef CONFIG_COREMU
+#include "cm-atomic.c"
+#endif
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 0eccca5..b6d07c5 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -72,21 +72,21 @@ typedef struct DisasContext {
#define DISAS_WFI 4
#define DISAS_SWI 5
-static TCGv_ptr cpu_env;
+static COREMU_THREAD TCGv_ptr cpu_env;
/* We reuse the same 64-bit temporaries for efficiency. */
-static TCGv_i64 cpu_V0, cpu_V1, cpu_M0;
-static TCGv_i32 cpu_R[16];
-static TCGv_i32 cpu_exclusive_addr;
-static TCGv_i32 cpu_exclusive_val;
-static TCGv_i32 cpu_exclusive_high;
+static COREMU_THREAD TCGv_i64 cpu_V0, cpu_V1, cpu_M0;
+static COREMU_THREAD TCGv_i32 cpu_R[16];
+static COREMU_THREAD TCGv_i32 cpu_exclusive_addr;
+static COREMU_THREAD TCGv_i32 cpu_exclusive_val;
+static COREMU_THREAD TCGv_i32 cpu_exclusive_high;
#ifdef CONFIG_USER_ONLY
static TCGv_i32 cpu_exclusive_test;
static TCGv_i32 cpu_exclusive_info;
#endif
/* FIXME: These should be removed. */
-static TCGv cpu_F0s, cpu_F1s;
-static TCGv_i64 cpu_F0d, cpu_F1d;
+static COREMU_THREAD TCGv cpu_F0s, cpu_F1s;
+static COREMU_THREAD TCGv_i64 cpu_F0d, cpu_F1d;
#include "gen-icount.h"
@@ -123,7 +123,7 @@ void arm_translate_init(void)
#include "helpers.h"
}
-static int num_temps;
+static COREMU_THREAD int num_temps;
/* Allocate a temporary variable. */
static TCGv_i32 new_tmp(void)
@@ -6026,6 +6026,12 @@ static void disas_arm_insn(CPUState * env, DisasContext *s)
TCGv tmp2;
TCGv tmp3;
TCGv addr;
+
+#ifdef CONFIG_COREMU
+ TCGv cm_tmp;
+ TCGv cm_tmp1;
+#endif
+
TCGv_i64 tmp64;
insn = ldl_code(s->pc);
@@ -6069,7 +6075,11 @@ static void disas_arm_insn(CPUState * env, DisasContext *s)
switch ((insn >> 4) & 0xf) {
case 1: /* clrex */
ARCH(6K);
+#ifdef CONFIG_COREMU
+ gen_helper_clear_exclusive();
+#else
gen_clrex(s);
+#endif
return;
case 4: /* dsb */
case 5: /* dmb */
@@ -6655,36 +6665,75 @@ static void disas_arm_insn(CPUState * env, DisasContext *s)
addr = tcg_temp_local_new_i32();
load_reg_var(s, addr, rn);
if (insn & (1 << 20)) {
+#ifdef CONFIG_COREMU
+ cm_tmp = tcg_const_i32(rd);
+#endif
switch (op1) {
case 0: /* ldrex */
- gen_load_exclusive(s, rd, 15, addr, 2);
+#ifdef CONFIG_COREMU
+ gen_helper_load_exclusivel(cm_tmp, addr);
+#else
+ gen_load_exclusive(s, rd, 15, addr, 2);
+#endif
break;
case 1: /* ldrexd */
+#ifdef CONFIG_COREMU
+ gen_helper_load_exclusiveq(cm_tmp, addr);
+#else
gen_load_exclusive(s, rd, rd + 1, addr, 3);
break;
+#endif
case 2: /* ldrexb */
+#ifdef CONFIG_COREMU
+ gen_helper_load_exclusiveb(cm_tmp, addr);
+#else
gen_load_exclusive(s, rd, 15, addr, 0);
+#endif
break;
case 3: /* ldrexh */
+#ifdef CONFIG_COREMU
+ gen_helper_load_exclusivew(cm_tmp, addr);
+#else
gen_load_exclusive(s, rd, 15, addr, 1);
+#endif
break;
default:
abort();
}
} else {
rm = insn & 0xf;
+#ifdef CONFIG_COREMU
+ cm_tmp = tcg_const_i32(rd);
+ cm_tmp1 = tcg_const_i32(rm);
+#endif
switch (op1) {
case 0: /* strex */
+#ifdef CONFIG_COREMU
+ gen_helper_store_exclusivel(cm_tmp, cm_tmp1, addr);
+#else
gen_store_exclusive(s, rd, rm, 15, addr, 2);
+#endif
break;
case 1: /* strexd */
+#ifdef CONFIG_COREMU
+ gen_helper_store_exclusiveq(cm_tmp, cm_tmp1, addr);
+#else
gen_store_exclusive(s, rd, rm, rm + 1, addr, 3);
+#endif
break;
case 2: /* strexb */
+#ifdef CONFIG_COREMU
+ gen_helper_store_exclusiveb(cm_tmp, cm_tmp1, addr);
+#else
gen_store_exclusive(s, rd, rm, 15, addr, 0);
+#endif
break;
case 3: /* strexh */
+#ifdef CONFIG_COREMU
+ gen_helper_store_exclusivew(cm_tmp, cm_tmp1, addr);
+#else
gen_store_exclusive(s, rd, rm, 15, addr, 1);
+#endif
break;
default:
abort();
@@ -7333,6 +7382,10 @@ static int disas_thumb2_insn(CPUState *env, DisasContext *s, uint16_t insn_hw1)
TCGv tmp;
TCGv tmp2;
TCGv tmp3;
+#ifdef CONFIG_COREMU
+ TCGv cm_tmp;
+ TCGv cm_tmp1;
+#endif
TCGv addr;
TCGv_i64 tmp64;
int op;
@@ -7445,9 +7498,20 @@ static int disas_thumb2_insn(CPUState *env, DisasContext *s, uint16_t insn_hw1)
load_reg_var(s, addr, rn);
tcg_gen_addi_i32(addr, addr, (insn & 0xff) << 2);
if (insn & (1 << 20)) {
+#ifdef CONFIG_COREMU
+ cm_tmp = tcg_const_i32(rs);
+ gen_helper_load_exclusivel(cm_tmp, addr);
+#else
gen_load_exclusive(s, rs, 15, addr, 2);
+#endif
} else {
+#ifdef CONFIG_COREMU
+ cm_tmp = tcg_const_i32(rd);
+ cm_tmp1 = tcg_const_i32(rs);
+ gen_helper_store_exclusivel(cm_tmp, cm_tmp1, addr);
+#else
gen_store_exclusive(s, rd, rs, 15, addr, 2);
+#endif
}
tcg_temp_free(addr);
} else if ((insn & (1 << 6)) == 0) {
diff --git a/target-i386/cm-atomic.c b/target-i386/cm-atomic.c
new file mode 100644
index 0000000..ecb9349
--- /dev/null
+++ b/target-i386/cm-atomic.c
@@ -0,0 +1,491 @@
+/*
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ * Xi Wu <wuxi@fudan.edu.cn>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/* We include this file in op_helper.c */
+
+#include <stdlib.h>
+#include <pthread.h>
+#include "coremu-atomic.h"
+#include "coremu-sched.h"
+#include "coremu-types.h"
+
+/* These definitions are copied from translate.c */
+#if defined(WORDS_BIGENDIAN)
+#define REG_B_OFFSET (sizeof(target_ulong) - 1)
+#define REG_H_OFFSET (sizeof(target_ulong) - 2)
+#define REG_W_OFFSET (sizeof(target_ulong) - 2)
+#define REG_L_OFFSET (sizeof(target_ulong) - 4)
+#define REG_LH_OFFSET (sizeof(target_ulong) - 8)
+#else
+#define REG_B_OFFSET 0
+#define REG_H_OFFSET 1
+#define REG_W_OFFSET 0
+#define REG_L_OFFSET 0
+#define REG_LH_OFFSET 4
+#endif
+
+#define REG_LOW_MASK (~(uint64_t)0x0>>32)
+
+/* gen_op instructions */
+/* i386 arith/logic operations */
+enum {
+ OP_ADDL,
+ OP_ORL,
+ OP_ADCL,
+ OP_SBBL,
+ OP_ANDL,
+ OP_SUBL,
+ OP_XORL,
+ OP_CMPL,
+};
+
+/* XXX: This is not platform specific; move it elsewhere later. */
+
+/* Given the guest virtual address, get the corresponding host address.
+ * This macro resembles ldxxx in softmmu_template.h
+ * NOTE: This must be inlined since the use of GETPC needs to get the
+ * return address. Using always_inline also works; we use a macro here to be
+ * more explicit. */
+#define CM_GET_QEMU_ADDR(q_addr, v_addr) \
+do { \
+ int __mmu_idx, __index; \
+ CPUState *__env1 = cpu_single_env; \
+ void *__retaddr; \
+ __index = (v_addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1); \
+ /* get the CPL, hence determine the MMU mode */ \
+ __mmu_idx = cpu_mmu_index(__env1); \
+ /* We use this function in the implementation of atomic instructions */ \
+ /* and we are going to modify these memory. So we use addr_write. */ \
+ if (unlikely(__env1->tlb_table[__mmu_idx][__index].addr_write \
+ != (v_addr & TARGET_PAGE_MASK))) { \
+ __retaddr = GETPC(); \
+ tlb_fill(v_addr, 1, __mmu_idx, __retaddr); \
+ } \
+ q_addr = v_addr + __env1->tlb_table[__mmu_idx][__index].addend; \
+} while(0)
+
+static target_ulong cm_get_reg_val(int ot, int hregs, int reg)
+{
+ target_ulong val, offset;
+ CPUState *env1 = cpu_single_env;
+
+ switch(ot) {
+ case 0: /*OT_BYTE*/
+ if (reg < 4 || reg >= 8 || hregs) {
+ goto std_case;
+ } else {
+ offset = offsetof(CPUState, regs[reg - 4]) + REG_H_OFFSET;
+ val = *(((uint8_t *)env1) + offset);
+ }
+ break;
+ default:
+ std_case:
+ val = env1->regs[reg];
+ break;
+ }
+
+ return val;
+}
+
+static void cm_set_reg_val(int ot, int hregs, int reg, target_ulong val)
+{
+ target_ulong offset;
+
+ CPUState *env1 = cpu_single_env;
+
+ switch(ot) {
+ case 0: /* OT_BYTE */
+ if (reg < 4 || reg >= 8 || hregs) {
+ offset = offsetof(CPUState, regs[reg]) + REG_B_OFFSET;
+ *(((uint8_t *) env1) + offset) = (uint8_t)val;
+ } else {
+ offset = offsetof(CPUState, regs[reg - 4]) + REG_H_OFFSET;
+ *(((uint8_t *) env1) + offset) = (uint8_t)val;
+ }
+ break;
+ case 1: /* OT_WORD */
+ offset = offsetof(CPUState, regs[reg]) + REG_W_OFFSET;
+ *((uint16_t *)((uint8_t *)env1 + offset)) = (uint16_t)val;
+ break;
+ case 2: /* OT_LONG */
+ env1->regs[reg] = REG_LOW_MASK & val;
+ break;
+ default:
+ case 3: /* OT_QUAD */
+ env1->regs[reg] = val;
+ break;
+ }
+}
+
+#define LD_b ldub_raw
+#define LD_w lduw_raw
+#define LD_l ldl_raw
+#define LD_q ldq_raw
+
+/* Lightweight transactional memory. */
+#define TX(vaddr, type, value, command) \
+ target_ulong __q_addr; \
+ DATA_##type __oldv; \
+ DATA_##type value; \
+ \
+ CM_GET_QEMU_ADDR(__q_addr, vaddr); \
+ do { \
+ __oldv = value = LD_##type((DATA_##type *)__q_addr); \
+ {command;}; \
+ mb(); \
+ } while (__oldv != (atomic_compare_exchange##type( \
+ (DATA_##type *)__q_addr, __oldv, value)))
+
+/* Atomically emulate INC instruction using CAS1 and memory transaction. */
+
+#define GEN_ATOMIC_INC(type, TYPE) \
+void helper_atomic_inc##type(target_ulong a0, int c) \
+{ \
+ int eflags_c, eflags; \
+ int cc_op; \
+ \
+ /* compute the previous instruction c flags */ \
+ eflags_c = helper_cc_compute_c(CC_OP); \
+ \
+ TX(a0, type, value, { \
+ if (c > 0) { \
+ value++; \
+ cc_op = CC_OP_INC##TYPE; \
+ } else { \
+ value--; \
+ cc_op = CC_OP_DEC##TYPE; \
+ } \
+ }); \
+ \
+ CC_SRC = eflags_c; \
+ CC_DST = value; \
+ \
+ eflags = helper_cc_compute_all(cc_op); \
+ CC_SRC = eflags; \
+} \
+
+GEN_ATOMIC_INC(b, B);
+GEN_ATOMIC_INC(w, W);
+GEN_ATOMIC_INC(l, L);
+GEN_ATOMIC_INC(q, Q);
+
+#define OT_b 0
+#define OT_w 1
+#define OT_l 2
+#define OT_q 3
+
+#define GEN_XCHG(type) \
+void helper_xchg##type(target_ulong a0, int reg, int hreg) \
+{ \
+ DATA_##type val, out; \
+ target_ulong q_addr; \
+ \
+ CM_GET_QEMU_ADDR(q_addr, a0); \
+ val = (DATA_##type)cm_get_reg_val(OT_##type, hreg, reg); \
+ out = atomic_exchange##type((DATA_##type *)q_addr, val); \
+ mb(); \
+ \
+ cm_set_reg_val(OT_##type, hreg, reg, out); \
+}
+
+GEN_XCHG(b);
+GEN_XCHG(w);
+GEN_XCHG(l);
+GEN_XCHG(q);
+
+#define GEN_OP(type, TYPE) \
+void helper_atomic_op##type(target_ulong a0, target_ulong t1, \
+ int op) \
+{ \
+ DATA_##type operand; \
+ int eflags_c, eflags; \
+ int cc_op; \
+ \
+ /* compute the previous instruction c flags */ \
+ eflags_c = helper_cc_compute_c(CC_OP); \
+ operand = (DATA_##type)t1; \
+ \
+ TX(a0, type, value, { \
+ switch(op) { \
+ case OP_ADCL: \
+ value += operand + eflags_c; \
+ cc_op = CC_OP_ADD##TYPE + (eflags_c << 2); \
+ CC_SRC = operand; \
+ break; \
+ case OP_SBBL: \
+ value = value - operand - eflags_c; \
+ cc_op = CC_OP_SUB##TYPE + (eflags_c << 2); \
+ CC_SRC = operand; \
+ break; \
+ case OP_ADDL: \
+ value += operand; \
+ cc_op = CC_OP_ADD##TYPE; \
+ CC_SRC = operand; \
+ break; \
+ case OP_SUBL: \
+ value -= operand; \
+ cc_op = CC_OP_SUB##TYPE; \
+ CC_SRC = operand; \
+ break; \
+ default: \
+ case OP_ANDL: \
+ value &= operand; \
+ cc_op = CC_OP_LOGIC##TYPE; \
+ break; \
+ case OP_ORL: \
+ value |= operand; \
+ cc_op = CC_OP_LOGIC##TYPE; \
+ break; \
+ case OP_XORL: \
+ value ^= operand; \
+ cc_op = CC_OP_LOGIC##TYPE; \
+ break; \
+ case OP_CMPL: \
+ abort(); \
+ break; \
+ } \
+ }); \
+ CC_DST = value; \
+ /* successful transaction, compute the eflags */ \
+ eflags = helper_cc_compute_all(cc_op); \
+ CC_SRC = eflags; \
+}
+
+GEN_OP(b, B);
+GEN_OP(w, W);
+GEN_OP(l, L);
+GEN_OP(q, Q);
+
+/* xadd */
+#define GEN_XADD(type, TYPE) \
+void helper_atomic_xadd##type(target_ulong a0, int reg, \
+ int hreg) \
+{ \
+ DATA_##type operand, oldv; \
+ int eflags; \
+ \
+ operand = (DATA_##type)cm_get_reg_val( \
+ OT_##type, hreg, reg); \
+ \
+ TX(a0, type, newv, { \
+ oldv = newv; \
+ newv += operand; \
+ }); \
+ \
+ /* transaction successes */ \
+ /* xchg the register and compute the eflags */ \
+ cm_set_reg_val(OT_##type, hreg, reg, oldv); \
+ CC_SRC = oldv; \
+ CC_DST = newv; \
+ \
+ eflags = helper_cc_compute_all(CC_OP_ADD##TYPE); \
+ CC_SRC = eflags; \
+}
+
+GEN_XADD(b, B);
+GEN_XADD(w, W);
+GEN_XADD(l, L);
+GEN_XADD(q, Q);
+
+/* cmpxchg */
+#define GEN_CMPXCHG(type, TYPE) \
+void helper_atomic_cmpxchg##type(target_ulong a0, int reg, \
+ int hreg) \
+{ \
+ DATA_##type reg_v, eax_v, res; \
+ int eflags; \
+ target_ulong q_addr; \
+ \
+ CM_GET_QEMU_ADDR(q_addr, a0); \
+ reg_v = (DATA_##type)cm_get_reg_val(OT_##type, hreg, reg); \
+ eax_v = (DATA_##type)cm_get_reg_val(OT_##type, 0, R_EAX); \
+ \
+ res = atomic_compare_exchange##type( \
+ (DATA_##type *)q_addr, eax_v, reg_v); \
+ mb(); \
+ \
+ if (res != eax_v) \
+ cm_set_reg_val(OT_##type, 0, R_EAX, res); \
+ \
+ CC_SRC = res; \
+ CC_DST = eax_v - res; \
+ \
+ eflags = helper_cc_compute_all(CC_OP_SUB##TYPE); \
+ CC_SRC = eflags; \
+}
+
+GEN_CMPXCHG(b, B);
+GEN_CMPXCHG(w, W);
+GEN_CMPXCHG(l, L);
+GEN_CMPXCHG(q, Q);
+
+/* cmpxchgb (8, 16) */
+void helper_atomic_cmpxchg8b(target_ulong a0)
+{
+ uint64_t edx_eax, ecx_ebx, res;
+ int eflags;
+ target_ulong q_addr;
+
+ eflags = helper_cc_compute_all(CC_OP);
+ CM_GET_QEMU_ADDR(q_addr, a0);
+
+ edx_eax = (((uint64_t)EDX << 32) | (uint32_t)EAX);
+ ecx_ebx = (((uint64_t)ECX << 32) | (uint32_t)EBX);
+
+ res = atomic_compare_exchangeq((uint64_t *)q_addr, edx_eax, ecx_ebx);
+ mb();
+
+ if (res == edx_eax) {
+ eflags |= CC_Z;
+ } else {
+ EDX = (uint32_t)(res >> 32);
+ EAX = (uint32_t)res;
+ eflags &= ~CC_Z;
+ }
+
+ CC_SRC = eflags;
+}
+
+void helper_atomic_cmpxchg16b(target_ulong a0)
+{
+ uint8_t res;
+ int eflags;
+ target_ulong q_addr;
+
+ eflags = helper_cc_compute_all(CC_OP);
+ CM_GET_QEMU_ADDR(q_addr, a0);
+
+ uint64_t old_rax = *(uint64_t *)q_addr;
+ uint64_t old_rdx = *(uint64_t *)(q_addr + 8);
+ res = atomic_compare_exchange16b((uint64_t *)q_addr, EAX, EDX, EBX, ECX);
+ mb();
+
+ if (res) {
+ eflags |= CC_Z; /* swap success */
+ } else {
+ EDX = old_rdx;
+ EAX = old_rax;
+ eflags &= ~CC_Z; /* read the old value ! */
+ }
+
+ CC_SRC = eflags;
+}
+
+/* not */
+#define GEN_NOT(type) \
+void helper_atomic_not##type(target_ulong a0) \
+{ \
+ TX(a0, type, value, { \
+ value = ~value; \
+ }); \
+}
+
+GEN_NOT(b);
+GEN_NOT(w);
+GEN_NOT(l);
+GEN_NOT(q);
+
+/* neg */
+#define GEN_NEG(type, TYPE) \
+void helper_atomic_neg##type(target_ulong a0) \
+{ \
+ int eflags; \
+ \
+ TX(a0, type, value, { \
+ value = -value; \
+ }); \
+ \
+ /* We should use the old value to compute CC */ \
+ CC_SRC = CC_DST = -value; \
+ \
+ eflags = helper_cc_compute_all(CC_OP_SUB##TYPE); \
+ CC_SRC = eflags; \
+} \
+
+GEN_NEG(b, B);
+GEN_NEG(w, W);
+GEN_NEG(l, L);
+GEN_NEG(q, Q);
+
+/* This is only used in BTX instruction, with an additional offset.
+ * Note that, when using register bitoffset, the value can be larger than
+ * operand size - 1 (operand size can be 16/32/64), refer to intel manual 2A
+ * page 3-11. */
+#define TX2(vaddr, type, value, offset, command) \
+ target_ulong __q_addr; \
+ DATA_##type __oldv; \
+ DATA_##type value; \
+ \
+ CM_GET_QEMU_ADDR(__q_addr, vaddr); \
+ __q_addr += offset >> 3; \
+ do { \
+ __oldv = value = LD_##type((DATA_##type *)__q_addr); \
+ {command;}; \
+ mb(); \
+ } while (__oldv != (atomic_compare_exchange##type( \
+ (DATA_##type *)__q_addr, __oldv, value)))
+
+#define GEN_BTX(ins, command) \
+void helper_atomic_##ins(target_ulong a0, target_ulong offset, \
+ int ot) \
+{ \
+ uint8_t old_byte; \
+ int eflags; \
+ \
+ TX2(a0, b, value, offset, { \
+ old_byte = value; \
+ {command;}; \
+ }); \
+ \
+ CC_SRC = (old_byte >> (offset & 0x7)); \
+ CC_DST = 0; \
+ eflags = helper_cc_compute_all(CC_OP_SARB + ot); \
+ CC_SRC = eflags; \
+}
+
+/* bts */
+GEN_BTX(bts, {
+ value |= (1 << (offset & 0x7));
+});
+/* btr */
+GEN_BTX(btr, {
+ value &= ~(1 << (offset & 0x7));
+});
+/* btc */
+GEN_BTX(btc, {
+ value ^= (1 << (offset & 0x7));
+});
+
+/* fence */
+void helper_fence(void)
+{
+ mb();
+}
+
+/* pause */
+void helper_pause(void)
+{
+ coremu_cpu_sched(CM_EVENT_PAUSE);
+}
diff --git a/target-i386/cm-atomic.h b/target-i386/cm-atomic.h
new file mode 100644
index 0000000..f888231
--- /dev/null
+++ b/target-i386/cm-atomic.h
@@ -0,0 +1,50 @@
+/*
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define __GEN_HEADER(type) \
+DEF_HELPER_2(atomic_inc##type, void, tl, int) \
+DEF_HELPER_3(xchg##type, void, tl, int, int) \
+DEF_HELPER_3(atomic_op##type, void, tl, tl, int) \
+DEF_HELPER_3(atomic_xadd##type, void, tl, int, int) \
+DEF_HELPER_3(atomic_cmpxchg##type, void, tl, int, int) \
+DEF_HELPER_1(atomic_not##type, void, tl) \
+DEF_HELPER_1(atomic_neg##type, void, tl)
+
+__GEN_HEADER(b)
+__GEN_HEADER(w)
+__GEN_HEADER(l)
+__GEN_HEADER(q)
+
+DEF_HELPER_1(atomic_cmpxchg8b, void, tl)
+DEF_HELPER_1(atomic_cmpxchg16b, void, tl)
+
+DEF_HELPER_3(atomic_bts, void, tl, tl, int)
+DEF_HELPER_3(atomic_btr, void, tl, tl, int)
+DEF_HELPER_3(atomic_btc, void, tl, tl, int)
+
+/* fence */
+DEF_HELPER_0(fence, void)
+
+/* pause */
+DEF_HELPER_0(pause, void)
+
diff --git a/target-i386/cm-target-intr.c b/target-i386/cm-target-intr.c
new file mode 100644
index 0000000..7d8a21d
--- /dev/null
+++ b/target-i386/cm-target-intr.c
@@ -0,0 +1,162 @@
+/*
+ * COREMU Parallel Emulator Framework
+ * The definition of interrupt related interface for i386
+ *
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include "cpu.h"
+#include "../hw/apic.h"
+
+#include "coremu-intr.h"
+#include "coremu-malloc.h"
+#include "coremu-atomic.h"
+#include "cm-intr.h"
+#include "cm-target-intr.h"
+
+/* The initial function for interrupts */
+
+static CMIntr *cm_pic_intr_init(int level)
+{
+ CMPICIntr *intr = coremu_mallocz(sizeof(*intr));
+ ((CMIntr *)intr)->handler = cm_pic_intr_handler;
+
+ intr->level = level;
+
+ return (CMIntr *)intr;
+}
+
+static CMIntr *cm_apicbus_intr_init(int mask, int vector_num, int trigger_mode)
+{
+ CMAPICBusIntr *intr = coremu_mallocz(sizeof(*intr));
+ ((CMIntr *)intr)->handler = cm_apicbus_intr_handler;
+
+ intr->mask = mask;
+ intr->vector_num = vector_num;
+ intr->trigger_mode = trigger_mode;
+
+ return (CMIntr *)intr;
+}
+
+static CMIntr *cm_ipi_intr_init(int vector_num, int deliver_mode)
+{
+ CMIPIIntr *intr = coremu_mallocz(sizeof(*intr));
+ ((CMIntr *)intr)->handler = cm_ipi_intr_handler;
+
+ intr->vector_num = vector_num;
+ intr->deliver_mode = deliver_mode;
+
+ return (CMIntr *)intr;
+}
+
+static CMIntr *cm_tlb_flush_req_init(void)
+{
+ CMTLBFlushReq *intr = coremu_mallocz(sizeof(*intr));
+ ((CMIntr *)intr)->handler = cm_tlb_flush_req_handler;
+
+ return (CMIntr *)intr;
+}
+
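+/* The cm_send_* helpers below package the interrupt information and hand it
+ * to coremu_send_intr(), which queues it on the target core's pending
+ * interrupt queue and notifies that core's thread with a real-time signal. */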
+void cm_send_pic_intr(int target, int level)
+{
+ coremu_send_intr(cm_pic_intr_init(level), target);
+}
+
+void cm_send_apicbus_intr(int target, int mask,
+ int vector_num, int trigger_mode)
+{
+ coremu_send_intr(cm_apicbus_intr_init(mask, vector_num, trigger_mode),
+ target);
+}
+
+void cm_send_ipi_intr(int target, int vector_num, int deliver_mode)
+{
+ coremu_send_intr(cm_ipi_intr_init(vector_num, deliver_mode), target);
+}
+
+void cm_send_tlb_flush_req(int target)
+{
+ assert(0);
+ coremu_send_intr(cm_tlb_flush_req_init(), target);
+}
+
+/* Handle the interrupt from the i8259 chip */
+void cm_pic_intr_handler(void *opaque)
+{
+ CMPICIntr *pic_intr = (CMPICIntr *) opaque;
+
+ CPUState *self = cpu_single_env;
+ int level = pic_intr->level;
+
+ if (self->apic_state) {
+ if (apic_accept_pic_intr(self))
+ apic_deliver_pic_intr(self, pic_intr->level);
+ } else {
+ if (level)
+ cpu_interrupt(self, CPU_INTERRUPT_HARD);
+ else
+ cpu_reset_interrupt(self, CPU_INTERRUPT_HARD);
+ }
+}
+
+/* Handle an interrupt from the APIC bus.
+ Interrupts from hardware connected to the IOAPIC and inter-processor
+ interrupts are all delivered through the APIC bus, so this kind of
+ interrupt can be either a hardware interrupt or an IPI */
+void cm_apicbus_intr_handler(void *opaque)
+{
+ CMAPICBusIntr *apicbus_intr = (CMAPICBusIntr *)opaque;
+
+ CPUState *self = cpu_single_env;
+
+ if (apicbus_intr->vector_num >= 0) {
+ cm_apic_set_irq(self->apic_state, apicbus_intr->vector_num,
+ apicbus_intr->trigger_mode);
+ } else {
+ /* For NMI, SMI and INIT the vector information is ignored */
+ cpu_interrupt(self, apicbus_intr->mask);
+ }
+}
+
+/* Handle the inter-processor interrupt (Only for INIT De-assert or SIPI) */
+void cm_ipi_intr_handler(void *opaque)
+{
+ CMIPIIntr *ipi_intr = (CMIPIIntr *)opaque;
+
+ CPUState *self = cpu_single_env;
+
+ if (ipi_intr->deliver_mode) {
+ /* SIPI */
+ cm_apic_startup(self->apic_state, ipi_intr->vector_num);
+ } else {
+ /* the INIT level de-assert */
+ cm_apic_setup_arbid(self->apic_state);
+ }
+}
+
+
+/* Handle the TLB flush request */
+void cm_tlb_flush_req_handler(void *opaque)
+{
+ tlb_flush(cpu_single_env, 1);
+}
diff --git a/target-i386/cm-target-intr.h b/target-i386/cm-target-intr.h
new file mode 100644
index 0000000..aa64215
--- /dev/null
+++ b/target-i386/cm-target-intr.h
@@ -0,0 +1,96 @@
+/*
+ * COREMU Parallel Emulator Framework
+ * Defines qemu related structure and interface for i386 architecture.
+ *
+ * Copyright (C) 2010 Parallel Processing Institute (PPI), Fudan Univ.
+ * <http://ppi.fudan.edu.cn/system_research_group>
+ *
+ * Authors:
+ * Zhaoguo Wang <zgwang@fudan.edu.cn>
+ * Yufei Chen <chenyufei@fudan.edu.cn>
+ * Ran Liu <naruilone@gmail.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+ /* Interrupt types for i386 architecture */
+#ifndef CM_I386_INTR_H
+#define CM_I386_INTR_H
+
+#include "cm-intr.h"
+
+enum cm_i386_intr_type {
+ PIC_INTR, /* Interrupt from i8259 pic */
+ APICBUS_INTR, /* Interrupt from the APIC bus;
+ can be issued by another core or the IOAPIC */
+ IPI_INTR, /* Interrupt from another core;
+ only for INIT de-assert and SIPI */
+ DIRECT_INTR, /* Direct interrupt (SMI) */
+ SHUTDOWN_REQ, /* Shutdown request */
+ TLB_FLUSH_REQ, /* This kind of request does not exist in the real world;
+ it is only here to conform to the QEMU framework */
+};
+
+
+/* Interrupt information for the i8259 PIC */
+typedef struct CMPICIntr {
+ CMIntr *base;
+ int level; /* the level of interrupt */
+} CMPICIntr;
+
+
+/* Interrupt information for IOAPIC */
+typedef struct CMAPICBusIntr {
+ CMIntr *base;
+ int mask; /* Qemu will use this to check which
+ kind of interrupt is issued */
+ int vector_num; /* The interrupt vector number
+ If the vector number is -1, it indicates
+ the vector information is ignored (SMI, NMI, INIT) */
+ int trigger_mode; /* The trigger mode of interrupt */
+} CMAPICBusIntr;
+
+
+typedef struct CMIPIIntr {
+ CMIntr *base;
+ int vector_num; /* The interrupt vector number */
+ int deliver_mode; /* The delivery mode of the interrupt
+ 0: INIT Level De-assert
+ 1: Start up IPI */
+} CMIPIIntr;
+
+typedef struct CMTLBFlushReq {
+ CMIntr *base;
+} CMTLBFlushReq;
+
+/* Declarations for the APIC wrapper functions */
+void cm_apic_set_irq(struct APICState *s, int vector_num, int trigger_mode);
+void cm_apic_startup(struct APICState *s, int vector_num);
+void cm_apic_setup_arbid(struct APICState *s);
+
+/* Declaration for the PIC wrapper function */
+void cm_pic_irq_request(void *opaque, int irq, int level);
+
+/* Common declarations */
+void cm_send_pic_intr(int target, int level);
+void cm_send_apicbus_intr(int target, int mask, int vector_num, int
+ trigger_mode);
+void cm_send_ipi_intr(int target, int vector_num, int deliver_mode);
+void cm_send_tlb_flush_req(int target);
+
+void cm_pic_intr_handler(void *opaque);
+void cm_apicbus_intr_handler(void *opaque);
+void cm_ipi_intr_handler(void *opaque);
+void cm_tlb_flush_req_handler(void *opaque);
+#endif
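(Illustrative note, not part of the patch.) The header above only declares the wrapper entry points; cm_pic_irq_request itself is defined elsewhere in the full patch. Interrupts are delivered by queueing a CMIntr object on the target core and letting the handler run in that core's thread, so a wrapper of this kind would simply forward the i8259 output line to the owning core. A minimal sketch, assuming CPU 0 receives the PIC output as on a PC (names here are hypothetical):
/* Sketch only -- the real cm_pic_irq_request in the full patch may differ. */
static void example_pic_irq_request(void *opaque, int irq, int level)
{
    (void)opaque;
    (void)irq;
    /* Queue the level change on core 0; cm_pic_intr_handler will run
     * later in that core's thread. */
    cm_send_pic_intr(0, level);
}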
diff --git a/target-i386/helper.c b/target-i386/helper.c
index c9508a8..1c487aa 100644
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -1127,7 +1127,10 @@ CPUX86State *cpu_x86_init(const char *cpu_model)
/* init various static tables */
if (!inited) {
inited = 1;
+#ifndef CONFIG_COREMU
+ /* For coremu, this is called in cm_cpu_exec_init_core. */
optimize_flags_init();
+#endif
#ifndef CONFIG_USER_ONLY
prev_debug_excp_handler =
cpu_set_debug_excp_handler(breakpoint_handler);
diff --git a/target-i386/helper.h b/target-i386/helper.h
index 6b518ad..c75a441 100644
--- a/target-i386/helper.h
+++ b/target-i386/helper.h
@@ -217,4 +217,9 @@ DEF_HELPER_2(rclq, tl, tl, tl)
DEF_HELPER_2(rcrq, tl, tl, tl)
#endif
+#include "coremu-config.h"
+#ifdef CONFIG_COREMU
+#include "cm-atomic.h"
+#endif
+
#include "def-helper.h"
diff --git a/target-i386/op_helper.c b/target-i386/op_helper.c
index dcbdfe7..121e739 100644
--- a/target-i386/op_helper.c
+++ b/target-i386/op_helper.c
@@ -5662,3 +5662,8 @@ uint32_t helper_cc_compute_c(int op)
#endif
}
}
+
+#include "coremu-config.h"
+#ifdef CONFIG_COREMU
+#include "cm-atomic.c"
+#endif
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 38c6016..f5f0fab 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -27,6 +27,7 @@
#include "exec-all.h"
#include "disas.h"
#include "tcg-op.h"
+#include "coremu-sched.h"
#include "helper.h"
#define GEN_HELPER 1
@@ -59,25 +60,25 @@
//#define MACRO_TEST 1
/* global register indexes */
-static TCGv_ptr cpu_env;
-static TCGv cpu_A0, cpu_cc_src, cpu_cc_dst, cpu_cc_tmp;
-static TCGv_i32 cpu_cc_op;
-static TCGv cpu_regs[CPU_NB_REGS];
+static COREMU_THREAD TCGv_ptr cpu_env;
+static COREMU_THREAD TCGv cpu_A0, cpu_cc_src, cpu_cc_dst, cpu_cc_tmp;
+static COREMU_THREAD TCGv_i32 cpu_cc_op;
+static COREMU_THREAD TCGv cpu_regs[CPU_NB_REGS];
/* local temps */
-static TCGv cpu_T[2], cpu_T3;
+static COREMU_THREAD TCGv cpu_T[2], cpu_T3;
/* local register indexes (only used inside old micro ops) */
-static TCGv cpu_tmp0, cpu_tmp4;
-static TCGv_ptr cpu_ptr0, cpu_ptr1;
-static TCGv_i32 cpu_tmp2_i32, cpu_tmp3_i32;
-static TCGv_i64 cpu_tmp1_i64;
-static TCGv cpu_tmp5;
+static COREMU_THREAD TCGv cpu_tmp0, cpu_tmp4;
+static COREMU_THREAD TCGv_ptr cpu_ptr0, cpu_ptr1;
+static COREMU_THREAD TCGv_i32 cpu_tmp2_i32, cpu_tmp3_i32;
+static COREMU_THREAD TCGv_i64 cpu_tmp1_i64;
+static COREMU_THREAD TCGv cpu_tmp5;
-static uint8_t gen_opc_cc_op[OPC_BUF_SIZE];
+static COREMU_THREAD uint8_t gen_opc_cc_op[OPC_BUF_SIZE];
#include "gen-icount.h"
#ifdef TARGET_X86_64
-static int x86_64_hregs;
+static COREMU_THREAD int x86_64_hregs;
#endif
typedef struct DisasContext {
@@ -1307,6 +1308,31 @@ static void gen_helper_fp_arith_STN_ST0(int op, int opreg)
/* if d == OR_TMP0, it means memory operand (address in A0) */
static void gen_op(DisasContext *s1, int op, int ot, int d)
{
+#ifdef CONFIG_COREMU
+ if (s1->prefix & PREFIX_LOCK) {
+ if (s1->cc_op != CC_OP_DYNAMIC)
+ gen_op_set_cc_op(s1->cc_op);
+
+ switch (ot & 3) {
+ case 0:
+ gen_helper_atomic_opb(cpu_A0,cpu_T[1], tcg_const_i32(op));
+ break;
+ case 1:
+ gen_helper_atomic_opw(cpu_A0,cpu_T[1], tcg_const_i32(op));
+ break;
+ case 2:
+ gen_helper_atomic_opl(cpu_A0,cpu_T[1], tcg_const_i32(op));
+ break;
+ default:
+#ifdef TARGET_X86_64
+ case 3:
+ gen_helper_atomic_opq(cpu_A0,cpu_T[1], tcg_const_i32(op));
+#endif
+ }
+ s1->cc_op = CC_OP_EFLAGS;
+ return;
+ }
+#endif
if (d != OR_TMP0) {
gen_op_mov_TN_reg(ot, 0, d);
} else {
@@ -1403,6 +1429,37 @@ static void gen_op(DisasContext *s1, int op, int ot, int d)
/* if d == OR_TMP0, it means memory operand (address in A0) */
static void gen_inc(DisasContext *s1, int ot, int d, int c)
{
+#ifdef CONFIG_COREMU
+ /* with lock prefix */
+ if (s1->prefix & PREFIX_LOCK) {
+ assert(d == OR_TMP0);
+
+ /* The helper will use CAS1 as a unified way to
+ implement atomic inc (locked inc) */
+ if (s1->cc_op != CC_OP_DYNAMIC)
+ gen_op_set_cc_op(s1->cc_op);
+
+ switch(ot & 3) {
+ case 0:
+ gen_helper_atomic_incb(cpu_A0, tcg_const_i32(c));
+ break;
+ case 1:
+ gen_helper_atomic_incw(cpu_A0, tcg_const_i32(c));
+ break;
+ case 2:
+ gen_helper_atomic_incl(cpu_A0, tcg_const_i32(c));
+ break;
+ default:
+#ifdef TARGET_X86_64
+ case 3:
+ gen_helper_atomic_incq(cpu_A0, tcg_const_i32(c));
+#endif
+ }
+ s1->cc_op = CC_OP_EFLAGS;
+ return;
+ }
+#endif
+
if (d != OR_TMP0)
gen_op_mov_TN_reg(ot, 0, d);
else
@@ -2712,7 +2769,7 @@ static void gen_eob(DisasContext *s)
if (s->singlestep_enabled) {
gen_helper_debug();
} else if (s->tf) {
- gen_helper_single_step();
+ gen_helper_single_step();
} else {
tcg_gen_exit_tb(0);
}
@@ -4208,9 +4265,13 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
s->aflag = aflag;
s->dflag = dflag;
+#ifndef CONFIG_COREMU
+ /* In coremu, atomic instructions are emulated with a light-weight memory
+ * transaction, so there is no need to use the lock. */
/* lock generation */
if (prefixes & PREFIX_LOCK)
gen_helper_lock();
+#endif
/* now check op code */
reswitch:
@@ -4372,6 +4433,30 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
s->cc_op = CC_OP_LOGICB + ot;
break;
case 2: /* not */
+#ifdef CONFIG_COREMU
+ if (s->prefix & PREFIX_LOCK) {
+ if (mod == 3)
+ goto illegal_op;
+
+ switch(ot & 3) {
+ case 0:
+ gen_helper_atomic_notb(cpu_A0);
+ break;
+ case 1:
+ gen_helper_atomic_notw(cpu_A0);
+ break;
+ case 2:
+ gen_helper_atomic_notl(cpu_A0);
+ break;
+ default:
+#ifdef TARGET_X86_64
+ case 3:
+ gen_helper_atomic_notq(cpu_A0);
+#endif
+ }
+ break;
+ }
+#endif
tcg_gen_not_tl(cpu_T[0], cpu_T[0]);
if (mod != 3) {
gen_op_st_T0_A0(ot + s->mem_index);
@@ -4380,6 +4465,34 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
}
break;
case 3: /* neg */
+#ifdef CONFIG_COREMU
+ if (s->prefix & PREFIX_LOCK) {
+ if (mod == 3)
+ goto illegal_op;
+
+ if (s->cc_op != CC_OP_DYNAMIC)
+ gen_op_set_cc_op(s->cc_op);
+
+ switch(ot & 3) {
+ case 0:
+ gen_helper_atomic_negb(cpu_A0);
+ break;
+ case 1:
+ gen_helper_atomic_negw(cpu_A0);
+ break;
+ case 2:
+ gen_helper_atomic_negl(cpu_A0);
+ break;
+ default:
+#ifdef TARGET_X86_64
+ case 3:
+ gen_helper_atomic_negq(cpu_A0);
+#endif
+ }
+ s->cc_op = CC_OP_EFLAGS;
+ break;
+ }
+#endif
tcg_gen_neg_tl(cpu_T[0], cpu_T[0]);
if (mod != 3) {
gen_op_st_T0_A0(ot + s->mem_index);
@@ -4834,7 +4947,38 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
gen_op_addl_T0_T1();
gen_op_mov_reg_T1(ot, reg);
gen_op_mov_reg_T0(ot, rm);
- } else {
+ } else
+#ifdef CONFIG_COREMU
+ if (s->prefix & PREFIX_LOCK) {
+ gen_lea_modrm(s, modrm, &reg_addr, &offset_addr);
+ if (s->cc_op != CC_OP_DYNAMIC)
+ gen_op_set_cc_op(s->cc_op);
+
+ switch (ot & 3) {
+ case 0:
+ gen_helper_atomic_xaddb(cpu_A0, tcg_const_i32(reg),
+ tcg_const_i32(x86_64_hregs));
+ break;
+ case 1:
+ gen_helper_atomic_xaddw(cpu_A0, tcg_const_i32(reg),
+ tcg_const_i32(x86_64_hregs));
+ break;
+ case 2:
+ gen_helper_atomic_xaddl(cpu_A0, tcg_const_i32(reg),
+ tcg_const_i32(x86_64_hregs));
+ break;
+ default:
+#ifdef TARGET_X86_64
+ case 3:
+ gen_helper_atomic_xaddq(cpu_A0, tcg_const_i32(reg),
+ tcg_const_i32(x86_64_hregs));
+#endif
+ }
+ s->cc_op = CC_OP_EFLAGS;
+ break;
+ } else
+#endif
+ {
gen_lea_modrm(s, modrm, &reg_addr, &offset_addr);
gen_op_mov_TN_reg(ot, 0, reg);
gen_op_ld_T1_A0(ot + s->mem_index);
@@ -4858,6 +5002,41 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
modrm = ldub_code(s->pc++);
reg = ((modrm >> 3) & 7) | rex_r;
mod = (modrm >> 6) & 3;
+
+#ifdef CONFIG_COREMU
+ if (s->prefix & PREFIX_LOCK) {
+ if (mod == 3)
+ goto illegal_op;
+
+ gen_lea_modrm(s, modrm, &reg_addr, &offset_addr);
+
+ if (s->cc_op != CC_OP_DYNAMIC)
+ gen_op_set_cc_op(s->cc_op);
+
+ switch(ot & 3) {
+ case 0:
+ gen_helper_atomic_cmpxchgb(cpu_A0, tcg_const_i32(reg),
+ tcg_const_i32(x86_64_hregs));
+ break;
+ case 1:
+ gen_helper_atomic_cmpxchgw(cpu_A0, tcg_const_i32(reg),
+ tcg_const_i32(x86_64_hregs));
+ break;
+ case 2:
+ gen_helper_atomic_cmpxchgl(cpu_A0, tcg_const_i32(reg),
+ tcg_const_i32(x86_64_hregs));
+ break;
+ default:
+#ifdef TARGET_X86_64
+ case 3:
+ gen_helper_atomic_cmpxchgq(cpu_A0, tcg_const_i32(reg),
+ tcg_const_i32(x86_64_hregs));
+#endif
+ }
+ s->cc_op = CC_OP_EFLAGS;
+ break;
+ }
+#endif
t0 = tcg_temp_local_new();
t1 = tcg_temp_local_new();
t2 = tcg_temp_local_new();
@@ -4912,9 +5091,14 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
if (s->cc_op != CC_OP_DYNAMIC)
gen_op_set_cc_op(s->cc_op);
gen_lea_modrm(s, modrm, &reg_addr, &offset_addr);
+#ifdef CONFIG_COREMU
+ if (s->prefix | PREFIX_LOCK) {
+ gen_helper_atomic_cmpxchg16b(cpu_A0);
+ } else
+#endif
gen_helper_cmpxchg16b(cpu_A0);
} else
-#endif
+#endif
{
if (!(s->cpuid_features & CPUID_CX8))
goto illegal_op;
@@ -4922,6 +5106,11 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
if (s->cc_op != CC_OP_DYNAMIC)
gen_op_set_cc_op(s->cc_op);
gen_lea_modrm(s, modrm, &reg_addr, &offset_addr);
+#ifdef CONFIG_COREMU
+ if (s->prefix | PREFIX_LOCK) {
+ gen_helper_atomic_cmpxchg8b(cpu_A0);
+ } else
+#endif
gen_helper_cmpxchg8b(cpu_A0);
}
s->cc_op = CC_OP_EFLAGS;
@@ -5315,15 +5504,43 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
gen_op_mov_reg_T1(ot, reg);
} else {
gen_lea_modrm(s, modrm, &reg_addr, &offset_addr);
+
+#ifdef CONFIG_COREMU
+ /* for xchg, lock is implicit.
+ XXX: no flags are affected! */
+ switch (ot & 3) {
+ case 0:
+ gen_helper_xchgb(cpu_A0, tcg_const_i32(reg),
+ tcg_const_i32(x86_64_hregs));
+ break;
+ case 1:
+ gen_helper_xchgw(cpu_A0, tcg_const_i32(reg),
+ tcg_const_i32(x86_64_hregs));
+ break;
+ case 2:
+ gen_helper_xchgl(cpu_A0, tcg_const_i32(reg),
+ tcg_const_i32(x86_64_hregs));
+ break;
+ default:
+#ifdef TARGET_X86_64
+ case 3:
+ gen_helper_xchgq(cpu_A0, tcg_const_i32(reg),
+ tcg_const_i32(x86_64_hregs));
+#endif
+ }
+#else
gen_op_mov_TN_reg(ot, 0, reg);
/* for xchg, lock is implicit */
if (!(prefixes & PREFIX_LOCK))
gen_helper_lock();
gen_op_ld_T1_A0(ot + s->mem_index);
gen_op_st_T0_A0(ot + s->mem_index);
+#ifndef CONFIG_COREMU
if (!(prefixes & PREFIX_LOCK))
gen_helper_unlock();
+#endif
gen_op_mov_reg_T1(ot, reg);
+#endif
}
break;
case 0xc4: /* les Gv */
@@ -6530,6 +6747,28 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
}
bt_op:
tcg_gen_andi_tl(cpu_T[1], cpu_T[1], (1 << (3 + ot)) - 1);
+#ifdef CONFIG_COREMU
+ if (s->prefix & PREFIX_LOCK) {
+ if (s->cc_op != CC_OP_DYNAMIC)
+ gen_op_set_cc_op(s->cc_op);
+
+ switch (op) {
+ case 0:
+ goto illegal_op;
+ break;
+ case 1:
+ gen_helper_atomic_bts(cpu_A0, cpu_T[1], tcg_const_i32(ot));
+ break;
+ case 2:
+ gen_helper_atomic_btr(cpu_A0, cpu_T[1], tcg_const_i32(ot));
+ break;
+ case 3:
+ gen_helper_atomic_btc(cpu_A0, cpu_T[1], tcg_const_i32(ot));
+ }
+ s->cc_op = CC_OP_EFLAGS;
+ break;
+ }
+#endif
switch(op) {
case 0:
tcg_gen_shr_tl(cpu_cc_src, cpu_T[0], cpu_T[1]);
@@ -6669,6 +6908,11 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
goto illegal_op;
if (prefixes & PREFIX_REPZ) {
gen_svm_check_intercept(s, pc_start, SVM_EXIT_PAUSE);
+ /* When the number of emulated cores exceeds the number of physical
+ cores on the machine, we need to catch the pause instruction to
+ prevent the lock-holder thread from being preempted. */
+ if (!coremu_physical_core_enough_p())
+ gen_helper_pause();
}
break;
case 0x9b: /* fwait */
@@ -7647,12 +7891,16 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
goto illegal_op;
}
/* lock generation */
+#ifndef CONFIG_COREMU
if (s->prefix & PREFIX_LOCK)
gen_helper_unlock();
+#endif
return s->pc;
illegal_op:
+#ifndef CONFIG_COREMU
if (s->prefix & PREFIX_LOCK)
gen_helper_unlock();
+#endif
/* XXX: ensure that no lock was generated */
gen_exception(s, EXCP06_ILLOP, pc_start - s->cs_base);
return s->pc;
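(Illustrative note, not part of the patch.) The gen_helper_atomic_* helpers called above are provided by cm-atomic.c/cm-atomic.h, which are not included in this excerpt; they replace the global lock with a compare-and-swap retry loop. A rough, hypothetical sketch of that idea for a byte-sized locked add, with simplified names and EFLAGS handling:
#include <stdint.h>

/* Sketch only -- the real helpers live in cm-atomic.c. */
static void example_atomic_addb(uint8_t *haddr, uint8_t val)
{
    uint8_t old, new_val;
    do {
        old = *haddr;
        new_val = old + val;
        /* Retry if another core changed *haddr in the meantime. */
    } while (!__sync_bool_compare_and_swap(haddr, old, new_val));
    /* The real helper also recomputes EFLAGS from old/new_val. */
}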
diff --git a/tcg/tcg.c b/tcg/tcg.c
index a99ecb9..b6e50d4 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -48,6 +48,7 @@
#include "cache-utils.h"
#include "host-utils.h"
#include "qemu-timer.h"
+#include "coremu-atomic.h"
/* Note: the long term plan is to reduce the dependancies on the QEMU
CPU definitions. Currently they are used for qemu_ld/st
@@ -59,6 +60,8 @@
#include "tcg-op.h"
#include "elf.h"
+#include "coremu-config.h"
+
#if defined(CONFIG_USE_GUEST_BASE) && !defined(TCG_TARGET_HAS_GUEST_BASE)
#error GUEST_BASE not supported on this host.
#endif
@@ -66,7 +69,7 @@
static void patch_reloc(uint8_t *code_ptr, int type,
tcg_target_long value, tcg_target_long addend);
-static TCGOpDef tcg_op_defs[] = {
+static COREMU_THREAD TCGOpDef tcg_op_defs[] = {
#define DEF(s, n, copy_size) { #s, 0, 0, n, n, 0, copy_size },
#define DEF2(s, oargs, iargs, cargs, flags) { #s, oargs, iargs, cargs, iargs + oargs + cargs, flags, 0 },
#include "tcg-opc.h"
@@ -74,12 +77,12 @@ static TCGOpDef tcg_op_defs[] = {
#undef DEF2
};
-static TCGRegSet tcg_target_available_regs[2];
-static TCGRegSet tcg_target_call_clobber_regs;
+static COREMU_THREAD TCGRegSet tcg_target_available_regs[2];
+static COREMU_THREAD TCGRegSet tcg_target_call_clobber_regs;
/* XXX: move that inside the context */
-uint16_t *gen_opc_ptr;
-TCGArg *gen_opparam_ptr;
+COREMU_THREAD uint16_t *gen_opc_ptr;
+COREMU_THREAD TCGArg *gen_opparam_ptr;
static inline void tcg_out8(TCGContext *s, uint8_t v)
{
@@ -242,9 +245,13 @@ void tcg_context_init(TCGContext *s)
tcg_target_init(s);
/* init global prologue and epilogue */
+#ifndef CONFIG_COREMU
+ /* For coremu, only one copy of the code prologue is needed. It is
+ * initialized in the hardware thread. */
s->code_buf = code_gen_prologue;
s->code_ptr = s->code_buf;
tcg_target_qemu_prologue(s);
+#endif
flush_icache_range((unsigned long)s->code_buf,
(unsigned long)s->code_ptr);
}
@@ -2143,3 +2150,37 @@ void tcg_dump_info(FILE *f,
cpu_fprintf(f, "[TCG profiler not compiled]\n");
}
#endif
+
+#ifdef CONFIG_COREMU
+
+#include <sys/types.h>
+#include <sys/mman.h>
+#include "cm-init.h"
+void cm_code_prologue_init(void)
+{
+ TCGContext tmp_ctx;
+ memset(&tmp_ctx, 0, sizeof(tmp_ctx));
+
+ /* init global prologue and epilogue */
+ tmp_ctx.code_buf = code_gen_prologue;
+ tmp_ctx.code_ptr = tmp_ctx.code_buf;
+ tcg_target_qemu_prologue(&tmp_ctx);
+}
+
+void cm_inject_invalidate_code(TranslationBlock *tb)
+{
+ uint16_t ret = atomic_compare_exchangew(&tb->has_invalidate, 0, 1);
+
+ if(ret == 1)
+ return;
+
+ TCGContext *s = &tcg_ctx;
+ s->code_buf = tb->tc_ptr;
+ s->code_ptr = tb->tc_ptr;
+
+ tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_RAX, (long) tb + 3);
+ tcg_out8(s, 0xe9); /* jmp tb_ret_addr */
+ tcg_out32(s, tb_ret_addr - s->code_ptr - 4);
+}
+
+#endif /* CONFIG_COREMU */
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 44856e1..171b233 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -318,11 +318,11 @@ struct TCGContext {
#endif
};
-extern TCGContext tcg_ctx;
-extern uint16_t *gen_opc_ptr;
-extern TCGArg *gen_opparam_ptr;
-extern uint16_t gen_opc_buf[];
-extern TCGArg gen_opparam_buf[];
+extern COREMU_THREAD TCGContext tcg_ctx;
+extern COREMU_THREAD uint16_t *gen_opc_ptr;
+extern COREMU_THREAD TCGArg *gen_opparam_ptr;
+extern COREMU_THREAD uint16_t gen_opc_buf[];
+extern COREMU_THREAD TCGArg gen_opparam_buf[];
/* pool based memory allocation */
diff --git a/tcg/x86_64/tcg-target.c b/tcg/x86_64/tcg-target.c
index 3892f75..fdc8784 100644
--- a/tcg/x86_64/tcg-target.c
+++ b/tcg/x86_64/tcg-target.c
@@ -22,6 +22,8 @@
* THE SOFTWARE.
*/
+#include "coremu-config.h"
+
#ifndef NDEBUG
static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
"%rax",
diff --git a/translate-all.c b/translate-all.c
index 91cbbc4..d7dc4ea 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -31,15 +31,17 @@
#include "tcg.h"
#include "qemu-timer.h"
+#include "coremu-config.h"
+
/* code generation context */
-TCGContext tcg_ctx;
+COREMU_THREAD TCGContext tcg_ctx;
-uint16_t gen_opc_buf[OPC_BUF_SIZE];
-TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
+COREMU_THREAD uint16_t gen_opc_buf[OPC_BUF_SIZE];
+COREMU_THREAD TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
-target_ulong gen_opc_pc[OPC_BUF_SIZE];
-uint16_t gen_opc_icount[OPC_BUF_SIZE];
-uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
+COREMU_THREAD target_ulong gen_opc_pc[OPC_BUF_SIZE];
+COREMU_THREAD uint16_t gen_opc_icount[OPC_BUF_SIZE];
+COREMU_THREAD uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
/* XXX: suppress that */
unsigned long code_gen_max_block_size(void)
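(Illustrative note, not part of the patch.) COREMU_THREAD is defined in the COREMU library headers rather than in this excerpt; presumably it expands to GCC's __thread storage class when CONFIG_COREMU is set, so each emulated core's thread keeps a private copy of the translator state marked above. Roughly:
/* Presumed definition -- the actual macro lives in the COREMU headers,
 * not in this patch excerpt. */
#ifdef CONFIG_COREMU
#define COREMU_THREAD __thread   /* one instance per core thread */
#else
#define COREMU_THREAD            /* plain globals for vanilla QEMU */
#endif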
diff --git a/vl.c b/vl.c
index 85bcc84..68b2505 100644
--- a/vl.c
+++ b/vl.c
@@ -165,6 +165,18 @@ int main(int argc, char **argv)
//#define DEBUG_NET
//#define DEBUG_SLIRP
+#include "coremu-config.h"
+#include "coremu-init.h"
+#include "coremu-core.h"
+#include "coremu-intr.h"
+#include "coremu-thread.h"
+#include "coremu-debug.h"
+#include "cm-loop.h"
+#include "cm-intr.h"
+#include "cm-init.h"
+#include "cm-timer.h"
+
+//#include "cm-i386-intr.h"
#define DEFAULT_RAM_SIZE 128
@@ -1710,6 +1722,13 @@ int debug_requested;
int vmstop_requested;
static int exit_requested;
+#ifdef CONFIG_COREMU
+int test_reset_request(void)
+{
+ return reset_requested;
+}
+#endif
+
int qemu_shutdown_requested(void)
{
int r = shutdown_requested;
@@ -1874,8 +1893,12 @@ void main_loop_wait(int nonblocking)
if (nonblocking)
timeout = 0;
else {
+#ifdef CONFIG_COREMU
+ timeout = 1000;
+#else
timeout = qemu_calculate_timeout();
qemu_bh_update_timeout(&timeout);
+#endif
}
host_main_loop_wait(&timeout);
@@ -1959,8 +1982,28 @@ qemu_irq qemu_system_powerdown;
static void main_loop(void)
{
int r;
+#ifdef CONFIG_COREMU
+ /* 1. Not finished: some initialization is still needed here */
+
+ /* 2. register hook functions */
+ /* register the interrupt handler */
+ coremu_register_event_handler((void (*)(void*))cm_common_intr_handler);
+ /* register the event notifier */
+ coremu_register_event_notifier(cm_notify_event);
+ /* 3. register core alarm handler. */
+ struct sigaction act;
+ sigfillset(&act.sa_mask);
+ act.sa_flags = 0;
+ extern void cm_local_host_alarm_handler(int host_signum);
+ act.sa_handler = cm_local_host_alarm_handler;
+ sigaction(COREMU_CORE_ALARM, &act, NULL);
+
+ /* 4. Create the cpu threads, each running cm_cpu_loop as its body */
+ coremu_run_all_cores(cm_cpu_loop);
+#else
qemu_main_loop_start();
+#endif
for (;;) {
do {
@@ -1968,9 +2011,11 @@ static void main_loop(void)
#ifdef CONFIG_PROFILER
int64_t ti;
#endif
+#ifndef CONFIG_COREMU
#ifndef CONFIG_IOTHREAD
nonblocking = tcg_cpu_exec();
#endif
+#endif
#ifdef CONFIG_PROFILER
ti = profile_getclock();
#endif
@@ -1991,6 +2036,7 @@ static void main_loop(void)
} else
break;
}
+#ifndef CONFIG_COREMU
if (qemu_reset_requested()) {
pause_all_vcpus();
qemu_system_reset();
@@ -2000,6 +2046,18 @@ static void main_loop(void)
monitor_protocol_event(QEVENT_POWERDOWN, NULL);
qemu_irq_raise(qemu_system_powerdown);
}
+#else
+ if (reset_requested) {
+ coremu_wait_all_cores_pause();
+ qemu_system_reset();
+ reset_requested = 0;
+ coremu_restart_all_cores();
+ }
+ if (qemu_powerdown_requested()) {
+ monitor_protocol_event(QEVENT_POWERDOWN, NULL);
+ exit(0);
+ }
+#endif
if ((r = qemu_vmstop_requested())) {
vm_stop(r);
}
@@ -3456,6 +3514,16 @@ int main(int argc, char **argv, char **envp)
exit(1);
}
+#ifdef CONFIG_COREMU
+ cm_print("\n%s\n%s\n%s",
+ "------------------------------------",
+ "| [COREMU Parallel Emulator] |",
+ "------------------------------------");
+ coremu_init(smp_cpus);
+ /* Initialize qemu variable for coremu */
+ cm_init_pit_freq();
+#endif
+
qemu_opts_foreach(&qemu_device_opts, default_driver_check, NULL, 0);
qemu_opts_foreach(&qemu_global_opts, default_driver_check, NULL, 0);
@@ -3632,7 +3700,11 @@ int main(int argc, char **argv, char **envp)
ram_size = DEFAULT_RAM_SIZE * 1024 * 1024;
/* init the dynamic translator */
+#ifdef CONFIG_COREMU
+ cm_cpu_exec_init();
+#else
cpu_exec_init_all(tb_size * 1024 * 1024);
+#endif
bdrv_init_with_whitelist();
@@ -3917,3 +3989,10 @@ int main(int argc, char **argv, char **envp)
return 0;
}
+
+#ifdef CONFIG_COREMU
+int cm_vm_can_run(void)
+{
+ return vm_can_run();
+}
+#endif
[-- Attachment #3: Type: text/plain, Size: 31 bytes --]
--
Best regards,
Chen Yufei
^ permalink raw reply related [flat|nested] 20+ messages in thread