* [RFC PATCH] docs/devel: add some notes on tcg-icount for developers
@ 2020-06-19 13:58 Alex Bennée
From: Alex Bennée @ 2020-06-19 13:58 UTC
  To: qemu-devel
  Cc: Peter Maydell, Paolo Bonzini, Richard Henderson, Alex Bennée,
	Pavel Dovgalyuk

This attempts to bring together my understanding of the requirements
for icount behaviour into one reference document for our developer
notes. It currently makes one conjecture, which I think is true: that
we don't need gen_io_start/end statements for non-MMIO related I/O
operations.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pavel Dovgalyuk <dovgaluk@ispras.ru>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Peter Maydell <peter.maydell@linaro.org>
---
 docs/devel/tcg-icount.rst | 86 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 86 insertions(+)
 create mode 100644 docs/devel/tcg-icount.rst

diff --git a/docs/devel/tcg-icount.rst b/docs/devel/tcg-icount.rst
new file mode 100644
index 00000000000..53d08ce9282
--- /dev/null
+++ b/docs/devel/tcg-icount.rst
@@ -0,0 +1,86 @@
+..
+   Copyright (c) 2019, Linaro Limited
+   Written by Alex Bennée
+
+
+========================
+TCG Instruction Counting
+========================
+
+TCG has long supported a feature known as icount which allows for
+instruction counting during execution. This should not be confused with
+cycle accurate emulation - QEMU does not attempt to emulate how long
+an instruction would take on real hardware. That is a job for other
+more detailed (and slower) tools that simulate the rest of a
+micro-architecture.
+
+This feature is only available for system emulation and is
+incompatible with multi-threaded TCG. It can be used to better align
+execution time with wall-clock time so a "slow" device doesn't run too
+fast on modern hardware. It can also provide a degree of
+deterministic execution and is an essential part of the record/replay
+support in QEMU.
+
+Core Concepts
+=============
+
+At its heart icount is simply a count of executed instructions which
+is stored in the TimersState of QEMU's timer sub-system. The number of
+executed instructions can then be used to calculate QEMU_CLOCK_VIRTUAL
+which represents the amount of elapsed time in the system since
+execution started. Depending on the icount mode this may either be a
+fixed number of ns per instruction or adjusted as execution continues
+to keep real time and virtual time in sync.
+
+To be able to calculate the number of executed instructions the
+translator starts by allocating a budget of instructions to be
+executed. The budget of instructions is limited by how long it will be
+until the next timer will expire. We store this budget as part of a
+CPU's icount_decr field, which is shared with the machinery for handling
+cpu_exit(). The whole field is checked at the start of every
+translated block and will cause us to return to the outer loop to deal
+with whatever caused the exit.
+
+In the case of icount, before the flag is checked we subtract the
+number of instructions the translation block would execute. If this
+would cause the instruction budget to go negative we exit the main
+loop and regenerate a new translation block with exactly the right
+number of instructions to take the budget to 0, meaning whatever timer
+was due to expire will expire exactly when we exit the main run loop.
+
+Dealing with MMIO
+-----------------
+
+While we can adjust the instruction budget for known events like timer
+expiry we cannot do the same for MMIO. Every load/store we execute
+might potentially trigger an I/O event at which point we will need an
+up to date and accurate reading of the icount number.
+
+To deal with this case, when an I/O access is made we:
+
+  - restore un-executed instructions to the icount budget
+  - re-compile a single [1]_ instruction block for the current PC
+  - exit the cpu loop and execute the re-compiled block
+
+The new block is created with the CF_LAST_IO compile flag which
+ensures the final instruction is wrapped with a
+gen_io_start()/gen_io_end() pair so we don't enter a perpetual loop
+constantly recompiling a single instruction block. For translators
+using the common translator_loop this is done automatically.
+
+.. [1] sometimes two instructions if dealing with delay slots
+
+Other I/O operations
+--------------------
+
+MMIO isn't the only type of operation for which we might need a
+correct and accurate clock. IO port instructions and accesses to
+system registers are the common examples here. For the clock to be
+accurate you must end a translation block on these instructions.
+
+.. warning:: (CONJECTURE) instructions that won't get trapped in the
+             io_read/writex shouldn't need gen_io_start/end blocks
+             around them.
+
+
+
-- 
2.20.1

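The budget handling described under "Core Concepts" can be sketched as
a toy model in C. This is only an illustration of the prose: the
IcountModel type and the model_* helpers are hypothetical names made up
for this sketch and are not QEMU's actual data structures or functions.
It assumes the fixed-rate mode (a constant number of ns per
instruction) and rounds the budget down for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: these names are hypothetical, not QEMU's real types. */
typedef struct {
    int64_t executed; /* instructions executed since execution started */
    int64_t budget;   /* instructions left before the next timer expires */
} IcountModel;

/* Fixed-rate mode: virtual time is a fixed number of ns per instruction. */
static int64_t model_virtual_ns(const IcountModel *m, int64_t ns_per_insn)
{
    return m->executed * ns_per_insn;
}

/*
 * The budget is the number of instructions that fit before the next
 * timer deadline (rounded down, a simplification made for this sketch).
 */
static int64_t model_budget_for_deadline(int64_t ns_to_deadline,
                                         int64_t ns_per_insn)
{
    assert(ns_per_insn > 0 && ns_to_deadline >= 0);
    return ns_to_deadline / ns_per_insn;
}

/*
 * Charge a translation block of tb_insns instructions against the
 * budget. If the whole block fits it runs as is. Otherwise the return
 * value is the size of the shorter block that has to be regenerated so
 * that the budget lands exactly on 0.
 */
static int64_t model_enter_tb(IcountModel *m, int64_t tb_insns)
{
    assert(tb_insns > 0 && m->budget >= 0);
    int64_t run = tb_insns <= m->budget ? tb_insns : m->budget;
    m->budget -= run;
    m->executed += run;
    return run;
}
```

With a deadline 1000 ns away and 4 ns per instruction the budget is 250
instructions. A 200-instruction block runs whole, the following
80-instruction block is cut down to 50, and the modelled virtual clock
then reads exactly 1000 ns, which is when the timer was due.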


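The three MMIO steps listed under "Dealing with MMIO" can be modelled
in the same spirit. Again this is a hedged sketch, not the real code
path: CpuModel, IoResult and the model_* functions are hypothetical,
and the can_do_io field here is just a flag of the model standing in
for "this is the final instruction of a block compiled with
CF_LAST_IO".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: hypothetical names, not QEMU's real structures. */
typedef struct {
    int64_t budget;  /* remaining instruction budget */
    bool can_do_io;  /* true only while the last insn of an I/O block runs */
} CpuModel;

typedef enum { IO_DONE, IO_RETRY_SINGLE_INSN } IoResult;

/*
 * An I/O access made by instruction io_index (0-based) of a block of
 * tb_insns instructions. The whole block was already charged to the
 * budget when it was entered.
 */
static IoResult model_io_access(CpuModel *cpu, int tb_insns, int io_index)
{
    assert(io_index >= 0 && io_index < tb_insns);
    if (cpu->can_do_io) {
        return IO_DONE; /* the count is exact, the access may proceed */
    }
    /*
     * Restore the un-executed instructions, including the I/O
     * instruction itself since it will be executed again.
     */
    cpu->budget += tb_insns - io_index;
    return IO_RETRY_SINGLE_INSN; /* caller recompiles one insn at this PC */
}

/* Execute the re-compiled single-instruction block. */
static IoResult model_run_single_io_insn(CpuModel *cpu)
{
    cpu->budget -= 1;      /* charge the one-instruction block */
    cpu->can_do_io = true; /* models the gen_io_start() side */
    IoResult r = model_io_access(cpu, 1, 0);
    cpu->can_do_io = false; /* models the gen_io_end() side */
    return r;
}
```

Starting from a budget of 100, entering a 10-instruction block leaves
90. If the fourth instruction (index 3) performs I/O, 7 instructions
are handed back, giving 97, and the single-instruction rerun brings the
budget to 96 with the access completed. Because the rerun is flagged,
it does not trigger another recompile.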