From: Kevin Wolf <kwolf@redhat.com>
To: Pavel Dovgalyuk <dovgaluk@ispras.ru>
Cc: 'Pavel Dovgalyuk' <Pavel.Dovgaluk@ispras.ru>,
qemu-devel@nongnu.org, peter.maydell@linaro.org,
war2jordan@live.com, crosthwaite.peter@gmail.com,
boost.lists@gmail.com, artem.k.pisarenko@gmail.com,
quintela@redhat.com, ciro.santilli@gmail.com,
jasowang@redhat.com, mst@redhat.com, armbru@redhat.com,
mreitz@redhat.com, maria.klimushenkova@ispras.ru,
kraxel@redhat.com, thomas.dullien@googlemail.com,
pbonzini@redhat.com, alex.bennee@linaro.org, dgilbert@redhat.com,
rth@twiddle.net
Subject: Re: [Qemu-devel] [PATCH v9 19/21] replay: add BH oneshot event for block layer
Date: Mon, 14 Jan 2019 12:35:20 +0100
Message-ID: <20190114113520.GC6837@linux.fritz.box>
In-Reply-To: <001001d4abf9$b6ea93e0$24bfbba0$@ru>

On 14.01.2019 at 12:10, Pavel Dovgalyuk wrote:
> > From: Kevin Wolf [mailto:kwolf@redhat.com]
> > On 09.01.2019 at 13:13, Pavel Dovgalyuk wrote:
> > > Replay is capable of recording normal BH events, but sometimes
> > > there are single use callbacks scheduled with aio_bh_schedule_oneshot
> > > function. This patch enables recording and replaying such callbacks.
> > > Block layer uses these events for calling the completion function.
> > > Replaying these calls makes the execution deterministic.
> > >
> > > Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
> >
> > This still doesn't come even close to catching all BHs that need to be
> > caught. While you managed to show a few BHs that actually don't need to
> > be considered for recording when I asked for this in v7, most BHs in the
> > block layer can in some way lead to device callbacks and must therefore
> > be recorded.
>
> Let's have a brief review. I can change all the places, but how
> should I make a test case to be sure that all of them are working OK?

The list is changing all the time. This is why I am so concerned about
special-casing a few callers instead of having a generic solution. I
don't know how we could make sure that we call the right function
everywhere.

> aio_bh_schedule_oneshot is used in:
> - blk_abort_aio_request
> - bdrv_co_yield_to_drain
> - iscsi_co_generic_cb
> - nfs_co_generic_cb
> - null_aio_common
> - nvme_process_completion
> - nvme_rw_cb
> - rbd_finish_aiocb
> - vxhs_iio_callback
> - (and a couple of others not in the block layer)

In addition to these, we have at least a few functions that just resume
block layer coroutines rather than directly scheduling a BH with a
callback somewhere in the block layer.

> We must change this call to replay_bh_schedule_oneshot_event when
> the result of the BH execution affects the replayed guest state
> (e.g., an interrupt request is generated or memory is written).
>
> If you think that all of these can do that, then I should change
> such function calls.

I haven't reviewed the code, but these names look like all of them can
eventually call back into the guest devices. They won't do that always,
but potentially.

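For illustration only (this is not code from the patch series): since any of these callbacks may potentially reach guest devices, one way to avoid special-casing individual callers is a single scheduling entry point that decides internally whether to record or replay. The toy model below is written in Python rather than QEMU's C, and `ReplayMode`, `Scheduler`, `schedule_oneshot`, and `run_pending` are hypothetical names. It also assumes each pending callback has a unique name.

```python
from collections import deque
from enum import Enum


class ReplayMode(Enum):
    NONE = "none"
    RECORD = "record"
    PLAY = "play"


class Scheduler:
    """Toy model of one entry point for one-shot bottom halves (hypothetical)."""

    def __init__(self, mode, log=None):
        self.mode = mode
        self.log = log if log is not None else []  # recorded callback names
        self.pending = deque()                      # (name, callback) pairs

    def schedule_oneshot(self, name, callback):
        # Every caller goes through here, so nobody can pick the wrong function.
        self.pending.append((name, callback))

    def run_pending(self):
        if self.mode is ReplayMode.PLAY:
            # Replay: run callbacks in the recorded order, not arrival order.
            waiting = dict(self.pending)  # assumes unique names in this toy
            while self.log and self.log[0] in waiting:
                waiting.pop(self.log.pop(0))()
            self.pending = deque(waiting.items())
            return
        while self.pending:
            name, callback = self.pending.popleft()
            if self.mode is ReplayMode.RECORD:
                self.log.append(name)  # remember the execution order
            callback()


# Record a run in which nvme completes before rbd.
order = []
rec = Scheduler(ReplayMode.RECORD)
rec.schedule_oneshot("nvme_rw_cb", lambda: order.append("nvme"))
rec.schedule_oneshot("rbd_finish_aiocb", lambda: order.append("rbd"))
rec.run_pending()

# During replay the completions arrive in the opposite order,
# but they are executed in the recorded order.
replayed = []
play = Scheduler(ReplayMode.PLAY, log=list(rec.log))
play.schedule_oneshot("rbd_finish_aiocb", lambda: replayed.append("rbd"))
play.schedule_oneshot("nvme_rw_cb", lambda: replayed.append("nvme"))
play.run_pending()
```

The point of the sketch is only that callers never choose between two scheduling functions; whether recording every BH this way is acceptable is exactly the open question in this thread.
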
> > How bad would it be to record some BHs even if recording them isn't
> > necessary? I'd definitely try to err on the safe side here. Having two
> > different sets of BH functions, you can't expect that people always use
> > the right one (especially if you don't even make the existing code base
> > consistently use the right one initially).
>
> There are two possible options:
> 1. Execution hangs when recording. This is a kind of deadlock caused by
> incorrect management of the events, e.g., when stopping the VM and trying
> to flush the block layer queue.
> 2. Execution hangs when replaying.
> One of the events that affect the guest state is missed or generated
> at a different moment (e.g., when the BH is not linked to the execution step).
> Then the guest behaves differently and the order of the events in the log
> does not match the guest state (e.g., interrupt processing is not matched).

So basically when you have two events that are kind of nested? Operation
A triggers event A, but in order to complete the operation, you call
operation B with event B internally, which isn't available yet because
we're still handling event A?

Could this be solved by not having an order of events, but an order of
sets of events?

Kevin
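
As an illustration of the "order of sets of events" idea (a sketch under stated assumptions, not a proposal from the thread and not QEMU code): the toy log below stores sets of events instead of single events. Any event in the current set may be consumed, and the log advances only once the set is empty. `SetOrderedLog`, `can_consume`, and `consume` are hypothetical names, and the model assumes an event is consumed when its handling completes.

```python
class SetOrderedLog:
    """Replay log whose entries are sets of events rather than single events."""

    def __init__(self, groups):
        # Drop empty groups so the log can never get stuck on one.
        self.groups = [set(g) for g in groups if g]
        self.index = 0

    def can_consume(self, event):
        return self.index < len(self.groups) and event in self.groups[self.index]

    def consume(self, event):
        if not self.can_consume(event):
            raise RuntimeError(f"event {event!r} is not allowed at this point")
        current = self.groups[self.index]
        current.remove(event)
        if not current:
            self.index += 1  # whole set handled, move to the next one


strict = SetOrderedLog([{"A"}, {"B"}])  # total order: A, then B
relaxed = SetOrderedLog([{"A", "B"}])   # A and B belong to one set

# Nested case: B has to complete while A is still being handled.
strict_allows_b_first = strict.can_consume("B")    # strict order would hang here
relaxed_allows_b_first = relaxed.can_consume("B")  # set order lets B go first
relaxed.consume("B")
relaxed.consume("A")
```

With a strict total order, B is rejected while A is pending, which corresponds to the hang described above. With A and B in one set, either may complete first.
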