From: Kevin Wolf <kwolf@redhat.com>
To: Pavel Dovgalyuk <dovgaluk@ispras.ru>
Cc: 'Pavel Dovgalyuk' <Pavel.Dovgaluk@ispras.ru>,
qemu-devel@nongnu.org, peter.maydell@linaro.org,
war2jordan@live.com, crosthwaite.peter@gmail.com,
boost.lists@gmail.com, artem.k.pisarenko@gmail.com,
quintela@redhat.com, ciro.santilli@gmail.com,
jasowang@redhat.com, mst@redhat.com, armbru@redhat.com,
mreitz@redhat.com, maria.klimushenkova@ispras.ru,
kraxel@redhat.com, thomas.dullien@googlemail.com,
pbonzini@redhat.com, alex.bennee@linaro.org, dgilbert@redhat.com,
rth@twiddle.net
Subject: Re: [Qemu-devel] [PATCH v13 19/25] replay: add BH oneshot event for block layer
Date: Tue, 5 Mar 2019 10:52:38 +0100
Message-ID: <20190305095238.GB5280@dhcp-200-226.str.redhat.com>
In-Reply-To: <002b01d4d284$3109aa20$931cfe60$@ru>

On 04.03.2019 at 13:17, Pavel Dovgalyuk wrote:
> > From: Kevin Wolf [mailto:kwolf@redhat.com]
> > Am 21.02.2019 um 12:05 hat Pavel Dovgalyuk geschrieben:
> > > Replay is capable of recording normal BH events, but sometimes
> > > there are single use callbacks scheduled with aio_bh_schedule_oneshot
> > > function. This patch enables recording and replaying such callbacks.
> > > Block layer uses these events for calling the completion function.
> > > Replaying these calls makes the execution deterministic.
> > >
> > > Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
> > >
> > > --
> > >
> > > v6:
> > > - moved stub function to the separate file for fixing linux-user build
> > > v10:
> > > - replaced all block layer aio_bh_schedule_oneshot calls
> > This still doesn't catch all instances, e.g. everything that goes
> > through aio_co_schedule() is missing.
>
> It seems, that everything else is synchronized with blkreplay driver
> which is mandatory when using block devices in rr mode.
Ah, yes, this is a good point. blkreplay goes through
replay_block_event(), which is where things get synchronised, right?
Does this mean that most of the places where you replaced a BH with your
new function don't actually need it either because they are called
through blkreplay and will go through replay_block_event() before
reaching the guest?
> > But I fully expect this to get broken anyway all the time because nobody
> > understands which function to use, and if it works for your special case
> > now and we'll fix other stuff as you encounter it, maybe that's good
> > enough for you.
>
> This problem exists in every subsystem and it is ok for now, when
> record/replay is not mature enough, and not familiar for others. When
> virtual devices are updated, developers may miss correct loadvm/savevm
> implementation. For example, loading the audio device state may miss
> shift the phase of the output signal. Nobody will notice that bug in
> the migration process, but it reveals when we use record/replay.
>
> We can't cover everything with record/replay tests. Most of the new
> bugs can be revealed in complex configurations after billions of
> executed instructions. But when this feature will be available out of
> the box, we'll at least get more smoke testing.
Ok.
> > > @@ -1349,8 +1351,8 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
> > >
> > >      acb->has_returned = true;
> > >      if (acb->rwco.ret != NOT_DONE) {
> > > -        aio_bh_schedule_oneshot(blk_get_aio_context(blk),
> > > -                                blk_aio_complete_bh, acb);
> > > +        replay_bh_schedule_oneshot_event(blk_get_aio_context(blk),
> > > +                                         blk_aio_complete_bh, acb);
> > >      }
> >
> > This, and a few other places that you convert, are in fast paths and add
> > some calls that are unnecessary for non-replay cases.
>
> I don't think that this can make a noticeable slowdown, but we can run
> the tests if you want.
> We have the test suite which performs disk-intensive computation.
> It was created to measure the effect of running BH callbacks through
> the virtual timer infrastructure.
I think this requires quite fast storage to possibly make a difference.
Or if you don't have that, maybe a ramdisk or even a null-co:// backend
could do the trick. Maybe null-co:// is actually the best option.
Anyway, if it's not too much work for you, running some tests would be
good.
> > I wonder if we could make replay optional in ./configure and then make
> > replay_bh_schedule_oneshot_event() a static inline function that can get
> > optimised away at compile time if the feature is disabled.
>
> It is coupled with icount. However, some icount calls also lie on
> the fast paths and are completely useless when icount is not enabled.
Well, the common fast path is KVM, which doesn't have icount at all, so
that might make it less critical. :-)
I get your point, though maybe that just means that it should be
possible to disable both at configure time.
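
To make the static inline idea concrete, here is a rough, self-contained
sketch. Everything in it is an assumption for illustration: CONFIG_REPLAY
is a hypothetical configure switch, replay_events_enabled and
replay_recorded are made-up names, and AioContext, QEMUBHFunc and
aio_bh_schedule_oneshot() are simplified stand-ins rather than QEMU's
real definitions. It only shows how the wrapper could collapse into the
plain BH call when the feature is compiled out.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for QEMU types; NOT the real definitions. */
typedef void QEMUBHFunc(void *opaque);
typedef struct AioContext { int scheduled; } AioContext;

/* Stand-in for aio_bh_schedule_oneshot(): counts the request and runs
 * the callback immediately instead of deferring it to the event loop. */
static void aio_bh_schedule_oneshot(AioContext *ctx, QEMUBHFunc *cb,
                                    void *opaque)
{
    assert(cb != NULL);
    ctx->scheduled++;
    cb(opaque);
}

#ifdef CONFIG_REPLAY /* hypothetical ./configure switch */
static int replay_events_enabled; /* hypothetical runtime flag */
static int replay_recorded;       /* hypothetical event counter */

static void replay_bh_schedule_oneshot_event(AioContext *ctx,
                                             QEMUBHFunc *cb, void *opaque)
{
    if (replay_events_enabled) {
        /* A real implementation would queue a replay event here. */
        replay_recorded++;
    }
    aio_bh_schedule_oneshot(ctx, cb, opaque);
}
#else
/* Feature compiled out: the wrapper is a trivial forwarder that the
 * compiler can inline, so the fast path pays nothing extra. */
static inline void replay_bh_schedule_oneshot_event(AioContext *ctx,
                                                    QEMUBHFunc *cb,
                                                    void *opaque)
{
    aio_bh_schedule_oneshot(ctx, cb, opaque);
}
#endif

/* Example callback: marks the int pointed to by opaque as done. */
static void mark_done(void *opaque)
{
    *(int *)opaque = 1;
}
```

A caller such as blk_aio_prwv() would stay unchanged in both builds; for
example, replay_bh_schedule_oneshot_event(&ctx, mark_done, &done) behaves
like the plain BH call when CONFIG_REPLAY is not defined.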
Kevin