From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:56716)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1SCRaj-0000Wt-4u
	for qemu-devel@nongnu.org; Tue, 27 Mar 2012 04:17:25 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1SCRae-0004Gu-1R
	for qemu-devel@nongnu.org; Tue, 27 Mar 2012 04:17:20 -0400
Received: from eu1sys200aog107.obsmtp.com ([207.126.144.123]:53888)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1SCRad-0004ES-Ol
	for qemu-devel@nongnu.org; Tue, 27 Mar 2012 04:17:15 -0400
Date: Tue, 27 Mar 2012 10:15:27 +0200
Message-ID: <20120327081526.GA2591@gnx2503>
References: <1332767238-22730-1-git-send-email-cedric.vincent@st.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] [PATCH] sh4-linux-user: fix multi-threading regression.
To: Peter Maydell
Cc: Riku Voipio, qemu-devel, Aurelien Jarno

On Mon, Mar 26, 2012 at 07:23:58PM +0200, Peter Maydell wrote:
> 2012/3/26 Cédric VINCENT :
> > This reverts commit fd4bab10 "target-sh4: optimize exceptions":
>
> [cc'ing Aurelien as the author of that commit]
>
> > the function cpu_restore_state() isn't expected to be called in user-mode,
>
> Is this really true? host_signal_handler() calls cpu_signal_handler()
> calls handle_cpu_signal() calls cpu_restore_state() so hopefully
> it's OK to be called in at least *some* situations...
>
> > as a consequence it isn't protected from race conditions.  For
> > information, syscalls are exceptions on Linux/SH4.
> >
> > There were two possible fixes: either "tb_lock" is acquired/released
> > around the call to cpu_restore_state() [1] or the commit that
> > introduced this regression is reverted [2].
>
> Can you explain a bit further what the race condition is that occurs
> here?

Here is what I observed on a "real" application:

    thread2: "jump to a new block" | thread1: "do a syscall"
                                   |
    cpu_exec [tb_lock acquired]    | helper_trapa
    tb_find_fast                   | raise_exception
    tb_find_slow                   | cpu_restore_state_from_retaddr
    tb_gen_code                    | cpu_translate_state
    cpu_gen_code                   | gen_intermediate_code_pc
    gen_intermediate_code_pc       |

    /!\ thread1 and thread2 modify gen_opc_buf[] at the same time
        since thread1 didn't acquire tb_lock.

Actually, you can easily reproduce this race condition with the source
code below, where both threads do syscalls repeatedly:

8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

void *routine(void *arg)
{
    int thread_number = (int)arg;

    while (1) {
        printf("hello, I'm %d\n", thread_number);
        sleep(1);
    }
}

int main(void)
{
    int status;
    pthread_t thread;

    status = pthread_create(&thread, NULL, routine, (void *)2);
    if (status) {
        puts("can't create a thread, bye!");
        return 1;
    }

    routine((void *)1);
    return 0; /* Never reached. */
}
8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----

Before the fix:

    $ qemu-sh4 ./a.out
    hello, I'm 1
    hello, I'm 2
    hello, I'm 1
    hello, I'm 2
    hello, I'm 1
    qemu: uncaught target signal 11 (Segmentation fault) - core dumped
    Segmentation fault

    $ qemu-sh4 ./a.out
    qemu/tcg/tcg.c:1369: tcg fatal error
    qemu: uncaught target signal 11 (Segmentation fault) - core dumped
    Segmentation fault
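
For illustration only, the locking alternative [1] quoted above can be
modelled with a small standalone sketch. This is not QEMU code and not
the proposed patch: shared_buf is a hypothetical stand-in for
gen_opc_buf[], buf_lock for tb_lock, and translate_locked() and
run_locked_workers() are names made up for this example. Each worker
fills the shared buffer and then checks that no other thread wrote into
it in the meantime; because the whole fill-and-check step runs under the
lock, no torn round should ever be observed.

```c
#include <pthread.h>
#include <stddef.h>

#define BUF_LEN 512
#define ROUNDS  2000

/* Hypothetical stand-ins: shared_buf plays the role of gen_opc_buf[],
 * buf_lock the role of tb_lock. Neither is taken from QEMU. */
static int shared_buf[BUF_LEN];
static pthread_mutex_t buf_lock = PTHREAD_MUTEX_INITIALIZER;

struct worker {
    int id;     /* value this thread writes into the buffer */
    long torn;  /* rounds in which another thread's values were seen */
};

/* Fill the buffer with our id, then count the entries that are not ours. */
static long fill_and_check(int id)
{
    long foreign = 0;

    for (size_t i = 0; i < BUF_LEN; i++)
        shared_buf[i] = id;
    for (size_t i = 0; i < BUF_LEN; i++)
        foreign += (shared_buf[i] != id);
    return foreign;
}

/* Thread body: every access to the shared buffer happens under buf_lock. */
static void *translate_locked(void *arg)
{
    struct worker *w = arg;

    for (int r = 0; r < ROUNDS; r++) {
        pthread_mutex_lock(&buf_lock);
        if (fill_and_check(w->id) != 0)
            w->torn++;
        pthread_mutex_unlock(&buf_lock);
    }
    return NULL;
}

/* Run two workers concurrently.
 * Returns the total number of torn rounds, or -1 if a thread
 * could not be created. */
long run_locked_workers(void)
{
    struct worker a = { 1, 0 }, b = { 2, 0 };
    pthread_t ta, tb;

    if (pthread_create(&ta, NULL, translate_locked, &a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, translate_locked, &b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return a.torn + b.torn;
}
```

Calling run_locked_workers() from a main() and building with -pthread
should report zero torn rounds. Dropping the lock/unlock pair in one of
the two workers would reproduce the pattern in the trace above, where
one side holds the lock and the other writes to the buffer without it.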