qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* Race with atexit functions in system emulation
@ 2020-07-01 11:05 Alex Bennée
  2020-07-01 11:14 ` Paolo Bonzini
  2020-07-02  6:18 ` Pavel Dovgalyuk
  0 siblings, 2 replies; 6+ messages in thread
From: Alex Bennée @ 2020-07-01 11:05 UTC (permalink / raw)
  To: qemu-devel, Pavel Dovgalyuk, Paolo Bonzini; +Cc: Peter Maydell

Hi,

While running some TSAN tests I ran into the following race condition:

  WARNING: ThreadSanitizer: data race (pid=1605)
    Write of size 4 at 0x55c437814d98 by thread T2 (mutexes: write M619):
      #0 replay_finish /home/alex.bennee/lsrc/qemu.git/replay/replay.c:393:17 (qemu-system-aarch64+0xc55116)
      #1 at_exit_wrapper() <null> (qemu-system-aarch64+0x368988)
      #2 handle_semihosting /home/alex.bennee/lsrc/qemu.git/target/arm/helper.c:9740:25 (qemu-system-aarch64+0x5e75b0)
      #3 arm_cpu_do_interrupt /home/alex.bennee/lsrc/qemu.git/target/arm/helper.c:9788:9 (qemu-system-aarch64+0x5e75b0)
      #4 cpu_handle_exception /home/alex.bennee/lsrc/qemu.git/accel/tcg/cpu-exec.c:504:13 (qemu-system-aarch64+0x4a4690)
      #5 cpu_exec /home/alex.bennee/lsrc/qemu.git/accel/tcg/cpu-exec.c:712:13 (qemu-system-aarch64+0x4a4690)
      #6 tcg_cpu_exec /home/alex.bennee/lsrc/qemu.git/cpus.c:1452:11 (qemu-system-aarch64+0x441157)
      #7 qemu_tcg_rr_cpu_thread_fn /home/alex.bennee/lsrc/qemu.git/cpus.c:1554:21 (qemu-system-aarch64+0x441157)
      #8 qemu_thread_start /home/alex.bennee/lsrc/qemu.git/util/qemu-thread-posix.c:521:9 (qemu-system-aarch64+0xe38bd0)

    Previous read of size 4 at 0x55c437814d98 by main thread:
      #0 replay_mutex_lock /home/alex.bennee/lsrc/qemu.git/replay/replay-internal.c:217:9 (qemu-system-aarch64+0xc55c03)
      #1 os_host_main_loop_wait /home/alex.bennee/lsrc/qemu.git/util/main-loop.c:239:5 (qemu-system-aarch64+0xe5af4f)
      #2 main_loop_wait /home/alex.bennee/lsrc/qemu.git/util/main-loop.c:518:11 (qemu-system-aarch64+0xe5af4f)
      #3 qemu_main_loop /home/alex.bennee/lsrc/qemu.git/softmmu/vl.c:1664:9 (qemu-system-aarch64+0x5ce806)
      #4 main /home/alex.bennee/lsrc/qemu.git/softmmu/main.c:49:5 (qemu-system-aarch64+0xdbf8b7)

    Location is global 'replay_mode' of size 4 at 0x55c437814d98 (qemu-system-aarch64+0x0000021a9d98)

Basically we have a clash between semihosting wanting to do an exit,
which is useful for reporting status and the fact that we have atexit()
handlers to clean up that clash with the main loop accessing the mutex
while we go. Ultimately I think this is harmless as we are shutting down
anyway but I was wondering how we would clean something like this up?

Should we maybe defer the exit to once the main loop has been exited
with a some sort of vmstop? Or could we have an atexit handler that
kills the main thread?

I should point out that linux-user has a fairly heavy preexit_cleanup
function to do this sort of tidying up. atexit() is also fairly heavily
used for other devices in system emulation.

Ideas?

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Race with atexit functions in system emulation
  2020-07-01 11:05 Race with atexit functions in system emulation Alex Bennée
@ 2020-07-01 11:14 ` Paolo Bonzini
  2020-07-02  6:18 ` Pavel Dovgalyuk
  1 sibling, 0 replies; 6+ messages in thread
From: Paolo Bonzini @ 2020-07-01 11:14 UTC (permalink / raw)
  To: Alex Bennée, qemu-devel, Pavel Dovgalyuk; +Cc: Peter Maydell

On 01/07/20 13:05, Alex Bennée wrote:
> Should we maybe defer the exit to once the main loop has been exited
> with a some sort of vmstop? Or could we have an atexit handler that
> kills the main thread?

Yes, I think the way to do "exit" is to use
qemu_system_shutdown_request.  Possibly halt the CPU?  This way you can
also obey -no-shutdown and give the user an occasion to inspect the
program.  You can add a global to pass back the exit code.

Alternatively, replay_finish could just take the lock.

Paolo



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Race with atexit functions in system emulation
  2020-07-01 11:05 Race with atexit functions in system emulation Alex Bennée
  2020-07-01 11:14 ` Paolo Bonzini
@ 2020-07-02  6:18 ` Pavel Dovgalyuk
  2020-07-02  7:49   ` Alex Bennée
  2020-07-02  9:18   ` Paolo Bonzini
  1 sibling, 2 replies; 6+ messages in thread
From: Pavel Dovgalyuk @ 2020-07-02  6:18 UTC (permalink / raw)
  To: Alex Bennée; +Cc: Paolo Bonzini, Pavel.Dovgaluk, qemu-devel, Peter Maydell

[-- Attachment #1: Type: text/plain, Size: 3339 bytes --]

Is it true, that semihosting can be used to access (read and write) host
files from the guest?
In such a case it can't be used with RR for the following reasons:
1. We don't preserve modified files, therefore the execution result may
change in the future runs.
2. Even in the case, when all the files are read only, semihosting FDs
can't be saved, therefore it may not be used with reverse debugging.

On Wed, Jul 1, 2020 at 2:06 PM Alex Bennée <alex.bennee@linaro.org> wrote:

> Hi,
>
> While running some TSAN tests I ran into the following race condition:
>
>   WARNING: ThreadSanitizer: data race (pid=1605)
>     Write of size 4 at 0x55c437814d98 by thread T2 (mutexes: write M619):
>       #0 replay_finish
> /home/alex.bennee/lsrc/qemu.git/replay/replay.c:393:17
> (qemu-system-aarch64+0xc55116)
>       #1 at_exit_wrapper() <null> (qemu-system-aarch64+0x368988)
>       #2 handle_semihosting
> /home/alex.bennee/lsrc/qemu.git/target/arm/helper.c:9740:25
> (qemu-system-aarch64+0x5e75b0)
>       #3 arm_cpu_do_interrupt
> /home/alex.bennee/lsrc/qemu.git/target/arm/helper.c:9788:9
> (qemu-system-aarch64+0x5e75b0)
>       #4 cpu_handle_exception
> /home/alex.bennee/lsrc/qemu.git/accel/tcg/cpu-exec.c:504:13
> (qemu-system-aarch64+0x4a4690)
>       #5 cpu_exec
> /home/alex.bennee/lsrc/qemu.git/accel/tcg/cpu-exec.c:712:13
> (qemu-system-aarch64+0x4a4690)
>       #6 tcg_cpu_exec /home/alex.bennee/lsrc/qemu.git/cpus.c:1452:11
> (qemu-system-aarch64+0x441157)
>       #7 qemu_tcg_rr_cpu_thread_fn
> /home/alex.bennee/lsrc/qemu.git/cpus.c:1554:21
> (qemu-system-aarch64+0x441157)
>       #8 qemu_thread_start
> /home/alex.bennee/lsrc/qemu.git/util/qemu-thread-posix.c:521:9
> (qemu-system-aarch64+0xe38bd0)
>
>     Previous read of size 4 at 0x55c437814d98 by main thread:
>       #0 replay_mutex_lock
> /home/alex.bennee/lsrc/qemu.git/replay/replay-internal.c:217:9
> (qemu-system-aarch64+0xc55c03)
>       #1 os_host_main_loop_wait
> /home/alex.bennee/lsrc/qemu.git/util/main-loop.c:239:5
> (qemu-system-aarch64+0xe5af4f)
>       #2 main_loop_wait
> /home/alex.bennee/lsrc/qemu.git/util/main-loop.c:518:11
> (qemu-system-aarch64+0xe5af4f)
>       #3 qemu_main_loop
> /home/alex.bennee/lsrc/qemu.git/softmmu/vl.c:1664:9
> (qemu-system-aarch64+0x5ce806)
>       #4 main /home/alex.bennee/lsrc/qemu.git/softmmu/main.c:49:5
> (qemu-system-aarch64+0xdbf8b7)
>
>     Location is global 'replay_mode' of size 4 at 0x55c437814d98
> (qemu-system-aarch64+0x0000021a9d98)
>
> Basically we have a clash between semihosting wanting to do an exit,
> which is useful for reporting status and the fact that we have atexit()
> handlers to clean up that clash with the main loop accessing the mutex
> while we go. Ultimately I think this is harmless as we are shutting down
> anyway but I was wondering how we would clean something like this up?
>
> Should we maybe defer the exit to once the main loop has been exited
> with a some sort of vmstop? Or could we have an atexit handler that
> kills the main thread?
>
> I should point out that linux-user has a fairly heavy preexit_cleanup
> function to do this sort of tidying up. atexit() is also fairly heavily
> used for other devices in system emulation.
>
> Ideas?
>
> --
> Alex Bennée
>


-- 
Pavel Dovgalyuk

[-- Attachment #2: Type: text/html, Size: 3827 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Race with atexit functions in system emulation
  2020-07-02  6:18 ` Pavel Dovgalyuk
@ 2020-07-02  7:49   ` Alex Bennée
  2020-07-02  7:54     ` Pavel Dovgalyuk
  2020-07-02  9:18   ` Paolo Bonzini
  1 sibling, 1 reply; 6+ messages in thread
From: Alex Bennée @ 2020-07-02  7:49 UTC (permalink / raw)
  To: Pavel Dovgalyuk; +Cc: Paolo Bonzini, Pavel.Dovgaluk, qemu-devel, Peter Maydell


Pavel Dovgalyuk <pavel.dovgaluk@gmail.com> writes:

> Is it true, that semihosting can be used to access (read and write) host
> files from the guest?

They can - but in these test cases we are only using semihosting for
console output and signalling an exit code at the end of the test. I
don't think that gets in the way of record/replay (aside from the exit
race described).

> In such a case it can't be used with RR for the following reasons:
> 1. We don't preserve modified files, therefore the execution result may
> change in the future runs.
> 2. Even in the case, when all the files are read only, semihosting FDs
> can't be saved, therefore it may not be used with reverse debugging.

This raises a wider question of what is the best way to indicate support
(or lack of support) for a particular device to a user? Do we need a
"replay aware" list or annotation?

>
> On Wed, Jul 1, 2020 at 2:06 PM Alex Bennée <alex.bennee@linaro.org> wrote:
>
>> Hi,
>>
>> While running some TSAN tests I ran into the following race condition:
>>
>>   WARNING: ThreadSanitizer: data race (pid=1605)
>>     Write of size 4 at 0x55c437814d98 by thread T2 (mutexes: write M619):
>>       #0 replay_finish
>> /home/alex.bennee/lsrc/qemu.git/replay/replay.c:393:17
>> (qemu-system-aarch64+0xc55116)
>>       #1 at_exit_wrapper() <null> (qemu-system-aarch64+0x368988)
>>       #2 handle_semihosting
>> /home/alex.bennee/lsrc/qemu.git/target/arm/helper.c:9740:25
>> (qemu-system-aarch64+0x5e75b0)
>>       #3 arm_cpu_do_interrupt
>> /home/alex.bennee/lsrc/qemu.git/target/arm/helper.c:9788:9
>> (qemu-system-aarch64+0x5e75b0)
>>       #4 cpu_handle_exception
>> /home/alex.bennee/lsrc/qemu.git/accel/tcg/cpu-exec.c:504:13
>> (qemu-system-aarch64+0x4a4690)
>>       #5 cpu_exec
>> /home/alex.bennee/lsrc/qemu.git/accel/tcg/cpu-exec.c:712:13
>> (qemu-system-aarch64+0x4a4690)
>>       #6 tcg_cpu_exec /home/alex.bennee/lsrc/qemu.git/cpus.c:1452:11
>> (qemu-system-aarch64+0x441157)
>>       #7 qemu_tcg_rr_cpu_thread_fn
>> /home/alex.bennee/lsrc/qemu.git/cpus.c:1554:21
>> (qemu-system-aarch64+0x441157)
>>       #8 qemu_thread_start
>> /home/alex.bennee/lsrc/qemu.git/util/qemu-thread-posix.c:521:9
>> (qemu-system-aarch64+0xe38bd0)
>>
>>     Previous read of size 4 at 0x55c437814d98 by main thread:
>>       #0 replay_mutex_lock
>> /home/alex.bennee/lsrc/qemu.git/replay/replay-internal.c:217:9
>> (qemu-system-aarch64+0xc55c03)
>>       #1 os_host_main_loop_wait
>> /home/alex.bennee/lsrc/qemu.git/util/main-loop.c:239:5
>> (qemu-system-aarch64+0xe5af4f)
>>       #2 main_loop_wait
>> /home/alex.bennee/lsrc/qemu.git/util/main-loop.c:518:11
>> (qemu-system-aarch64+0xe5af4f)
>>       #3 qemu_main_loop
>> /home/alex.bennee/lsrc/qemu.git/softmmu/vl.c:1664:9
>> (qemu-system-aarch64+0x5ce806)
>>       #4 main /home/alex.bennee/lsrc/qemu.git/softmmu/main.c:49:5
>> (qemu-system-aarch64+0xdbf8b7)
>>
>>     Location is global 'replay_mode' of size 4 at 0x55c437814d98
>> (qemu-system-aarch64+0x0000021a9d98)
>>
>> Basically we have a clash between semihosting wanting to do an exit,
>> which is useful for reporting status and the fact that we have atexit()
>> handlers to clean up that clash with the main loop accessing the mutex
>> while we go. Ultimately I think this is harmless as we are shutting down
>> anyway but I was wondering how we would clean something like this up?
>>
>> Should we maybe defer the exit to once the main loop has been exited
>> with a some sort of vmstop? Or could we have an atexit handler that
>> kills the main thread?
>>
>> I should point out that linux-user has a fairly heavy preexit_cleanup
>> function to do this sort of tidying up. atexit() is also fairly heavily
>> used for other devices in system emulation.
>>
>> Ideas?
>>
>> --
>> Alex Bennée
>>


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Race with atexit functions in system emulation
  2020-07-02  7:49   ` Alex Bennée
@ 2020-07-02  7:54     ` Pavel Dovgalyuk
  0 siblings, 0 replies; 6+ messages in thread
From: Pavel Dovgalyuk @ 2020-07-02  7:54 UTC (permalink / raw)
  To: Alex Bennée, Pavel Dovgalyuk
  Cc: Paolo Bonzini, qemu-devel, Peter Maydell

On 02.07.2020 10:49, Alex Bennée wrote:
> 
> Pavel Dovgalyuk <pavel.dovgaluk@gmail.com> writes:
> 
>> Is it true, that semihosting can be used to access (read and write) host
>> files from the guest?
> 
> They can - but in these test cases we are only using semihosting for
> console output and signalling an exit code at the end of the test. I
> don't think that gets in the way of record/replay (aside from the exit
> race described).

Ok.

>> In such a case it can't be used with RR for the following reasons:
>> 1. We don't preserve modified files, therefore the execution result may
>> change in the future runs.
>> 2. Even in the case, when all the files are read only, semihosting FDs
>> can't be saved, therefore it may not be used with reverse debugging.
> 
> This raises a wider question of what is the best way to indicate support
> (or lack of support) for a particular device to a user? Do we need a
> "replay aware" list or annotation?

There is a replay blocker, which is set at the command-line parsing phase.
E.g., user gets an error when tries to enable -smp >1.


Pavel Dovgalyuk


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Race with atexit functions in system emulation
  2020-07-02  6:18 ` Pavel Dovgalyuk
  2020-07-02  7:49   ` Alex Bennée
@ 2020-07-02  9:18   ` Paolo Bonzini
  1 sibling, 0 replies; 6+ messages in thread
From: Paolo Bonzini @ 2020-07-02  9:18 UTC (permalink / raw)
  To: Pavel Dovgalyuk, Alex Bennée
  Cc: Peter Maydell, Pavel.Dovgaluk, qemu-devel

On 02/07/20 08:18, Pavel Dovgalyuk wrote:
> Is it true, that semihosting can be used to access (read and write) host
> files from the guest?
> In such a case it can't be used with RR for the following reasons:
> 1. We don't preserve modified files, therefore the execution result may
> change in the future runs.
> 2. Even in the case, when all the files are read only, semihosting FDs
> can't be saved, therefore it may not be used with reverse debugging.

Since there are many cases in which semihosting and RR can work together
well, I think the fix here should not be to just block semihosting.  The
replay_finish issue is just an example, there can be more atexit handlers.

Paolo



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-07-02  9:18 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-07-01 11:05 Race with atexit functions in system emulation Alex Bennée
2020-07-01 11:14 ` Paolo Bonzini
2020-07-02  6:18 ` Pavel Dovgalyuk
2020-07-02  7:49   ` Alex Bennée
2020-07-02  7:54     ` Pavel Dovgalyuk
2020-07-02  9:18   ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).