From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:55378) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SF2sB-0007Nr-1r for qemu-devel@nongnu.org; Tue, 03 Apr 2012 08:30:12 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SF2s8-0007Vx-8i for qemu-devel@nongnu.org; Tue, 03 Apr 2012 08:30:06 -0400 Received: from eu1sys200aog101.obsmtp.com ([207.126.144.111]:59426) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SF2s7-0007SJ-W8 for qemu-devel@nongnu.org; Tue, 03 Apr 2012 08:30:04 -0400 Date: Tue, 3 Apr 2012 14:28:00 +0200 From: Message-ID: <20120403122800.GA10742@gnx2503> References: <1332767238-22730-1-git-send-email-cedric.vincent@st.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: Subject: [Qemu-devel] linux-user: threading issue in the signal handler (Was: [PATCH] sh4-linux-user: fix multi-threading) regression. List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Peter Maydell , Riku Voipio Cc: qemu-devel , Aurelien Jarno On Mon, Mar 26, 2012 at 07:23:58PM +0200, Peter Maydell wrote: > 2012/3/26 Cédric VINCENT : > > the function cpu_restore_state() isn't expected to be called in user-mode, > > Is this really true? host_signal_handler() calls cpu_signal_handler() > calls handle_cpu_signala) calls cpu_restore_state() so hopefully > it's OK to be called in at least *some* situations... Actually it's not that OK, the code below [1] exposes a threading issue in this specific part of QEMU: * the first thread makes invalid memory references, that way QEMU reaches the given situation (host_signal_handler -> cpu_signal_handler -> handle_cpu_signal -> cpu_restore_state -> gen_intermediate_code) without acquiring "tb_lock". * at the same time, the second thread executes code that wasn't translated before, that way the TCG is invoked with "tb_lock" acquired. Note that in this example I used a self modifying block just to simulate a not-yet-translated code. This example triggers an invalid memory reference and/or an assertion in the TCG part of QEMU. > NB the whole tb_lock thing is broken anyway. Because of such bugs or are there other reasons? Maybe we could add a simple sanity check in gen_intermediate_code() functions to be sure that "tb_lock" is acquired when they are called. Regards, Cédric. [1] I wrote this example for ARM and SH4, but this problem appears for any guest CPU: #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include pid_t gettid(void) { return syscall(SYS_gettid); } void handler(int signo) { printf("Thread %d received signal %d\n", gettid(), signo); } void *routine(void *arg) { pid_t tid = (pid_t)arg; while (1) { *(int *)NULL = 1; sleep(1); } } int main(void) { struct sigaction action; pthread_t thread; int status; bzero(&action, sizeof(action)); action.sa_handler = handler; sigfillset(&action.sa_mask); status = sigaction(SIGSEGV, &action, NULL); if (status < 0) { puts("can't regsiter sighandler, bye!"); exit(EXIT_FAILURE); } status = pthread_create(&thread, NULL, routine, (void *)gettid()); if (status) { puts("can't create a thread, bye!"); exit(EXIT_FAILURE); } #if defined(__SH4__) uint8_t self_modifying_code[] = { // _start 0x00, 0xc7, // mova @(4, pc), r0 0x01, 0x61, // mov.w @r0, r1 0x11, 0x20, // mov.w r1, @r0 0xfb, 0xaf, // bra _start 0x09, 0x00, // nop }; asm volatile ("jmp @%0 \n nop" : : "r" (self_modifying_code)); #elif defined(__arm__) uint32_t self_modifying_code[] = { // _start: 0xe1a0000f, // mov r0, pc 0xe5901000, // ldr r1, [r0] 0xe5801000, // str r1, [r0] 0xeafffffb, // b _start }; asm volatile ("mov pc, %0" : : "r" (self_modifying_code)); #else #error "unsupported architecture." #endif puts("should'nt be reached, bye!"); exit(EXIT_FAILURE); }