qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Cody <jcody@redhat.com>
To: John Snow <jsnow@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	vsementsov@virtuozzo.com, qemu-block@nongnu.org,
	qemu-devel@nongnu.org, stefanha@redhat.com, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [PATCH v3 4/6] blockjob: add block_job_start
Date: Mon, 7 Nov 2016 21:05:28 -0500	[thread overview]
Message-ID: <20161108020528.GC8956@localhost.localdomain> (raw)
In-Reply-To: <f5cd95d3-33a5-2451-1bf5-f9086cdb9164@redhat.com>

On Mon, Nov 07, 2016 at 09:02:14PM -0500, John Snow wrote:
> 
> 
> On 11/03/2016 08:17 AM, Kevin Wolf wrote:
> >Am 02.11.2016 um 18:50 hat John Snow geschrieben:
> >>Instead of automatically starting jobs at creation time via backup_start
> >>et al, we'd like to return a job object pointer that can be started
> >>manually at later point in time.
> >>
> >>For now, add the block_job_start mechanism and start the jobs
> >>automatically as we have been doing, with conversions job-by-job coming
> >>in later patches.
> >>
> >>Of note: cancellation of unstarted jobs will perform all the normal
> >>cleanup as if the job had started, particularly abort and clean. The
> >>only difference is that we will not emit any events, because the job
> >>never actually started.
> >>
> >>Signed-off-by: John Snow <jsnow@redhat.com>
> >
> >>diff --git a/block/commit.c b/block/commit.c
> >>index 20d27e2..5b7c454 100644
> >>--- a/block/commit.c
> >>+++ b/block/commit.c
> >>@@ -289,10 +289,9 @@ void commit_start(const char *job_id, BlockDriverState *bs,
> >>     s->backing_file_str = g_strdup(backing_file_str);
> >>
> >>     s->on_error = on_error;
> >>-    s->common.co = qemu_coroutine_create(s->common.driver->start, s);
> >>
> >>     trace_commit_start(bs, base, top, s, s->common.co);
> >
> >s->common.co is now uninitialised and should probably be removed from
> >the tracepoint arguments. The same is true for mirror and stream.
> >
> >>-    qemu_coroutine_enter(s->common.co);
> >>+    block_job_start(&s->common);
> >> }
> >
> >>diff --git a/blockjob.c b/blockjob.c
> >>index e3c458c..16c5159 100644
> >>--- a/blockjob.c
> >>+++ b/blockjob.c
> >>@@ -174,7 +174,8 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
> >>     job->blk           = blk;
> >>     job->cb            = cb;
> >>     job->opaque        = opaque;
> >>-    job->busy          = true;
> >>+    job->busy          = false;
> >>+    job->paused        = true;
> >>     job->refcnt        = 1;
> >>     bs->job = job;
> >>
> >>@@ -202,6 +203,21 @@ bool block_job_is_internal(BlockJob *job)
> >>     return (job->id == NULL);
> >> }
> >>
> >>+static bool block_job_started(BlockJob *job)
> >>+{
> >>+    return job->co;
> >>+}
> >>+
> >>+void block_job_start(BlockJob *job)
> >>+{
> >>+    assert(job && !block_job_started(job) && job->paused &&
> >>+           !job->busy && job->driver->start);
> >>+    job->paused = false;
> >>+    job->busy = true;
> >>+    job->co = qemu_coroutine_create(job->driver->start, job);
> >>+    qemu_coroutine_enter(job->co);
> >>+}
> >
> >We allow the user to pause a job while it's not started yet. You
> >classified this as "harmless". But if we accept this, can we really
> >unconditionally enter the coroutine even if the job has been paused?
> >Can't a user expect that a job remains in paused state when they
> >explicitly requested a pause and the job was already internally paused,
> >like in this case by block_job_create()?
> >
> 
> What will end up happening is that we'll enter the job, and then it'll pause
> immediately upon entrance. Is that a problem?
> 
> If the jobs themselves are not checking their pause state fastidiously, it
> could be (but block/backup does -- after it creates a write notifier.)
> 
> Do we want a stronger guarantee here?
> 
> Naively I think it's OK as-is, but I could add a stronger boolean in that
> lets us know if it's okay to start or not, and we could delay the actual
> creation and start until the 'resume' comes in if you'd like.
> 
> I'd like to avoid the complexity if we can help it, but perhaps I'm not
> thinking carefully enough about the existing edge cases.
> 

Is there any reason we can't just use job->pause_count here?  When the job
is created, set job->paused = true, and job->pause_count = 1.  In the
block_job_start(), check the pause_count prior to qemu_coroutine_enter():

    void block_job_start(BlockJob *job)
    {
        assert(job && !block_job_started(job) && job->paused &&
              !job->busy && job->driver->start);
        job->co = qemu_coroutine_create(job->driver->start, job);
        job->paused = --job->pause_count > 0;
        if (!job->paused) {
            job->busy = true;
            qemu_coroutine_enter(job->co);
        }
    }


> >The same probably also applies to the internal job pausing during
> >bdrv_drain_all_begin/end, though as you know there is a larger problem
> >with starting jobs under drain_all anyway. For now, we just need to keep
> >in mind that we can neither create nor start a job in such sections.
> >
> 
> Yeah, there are deeper problems there. As long as the existing critical
> sections don't allow us to create jobs (started or not) I think we're
> probably already OK.
> 
> >Kevin
> >

  reply	other threads:[~2016-11-08  2:05 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-02 17:50 [Qemu-devel] [PATCH v3 0/6] jobs: fix transactional race condition John Snow
2016-11-02 17:50 ` [Qemu-devel] [PATCH v3 1/6] blockjob: fix dead pointer in txn list John Snow
2016-11-08  2:47   ` Jeff Cody
2016-11-02 17:50 ` [Qemu-devel] [PATCH v3 2/6] blockjob: add .clean property John Snow
2016-11-08  2:51   ` Jeff Cody
2016-11-02 17:50 ` [Qemu-devel] [PATCH v3 3/6] blockjob: add .start field John Snow
2016-11-08  2:58   ` Jeff Cody
2016-11-02 17:50 ` [Qemu-devel] [PATCH v3 4/6] blockjob: add block_job_start John Snow
2016-11-03 12:17   ` Kevin Wolf
2016-11-08  2:02     ` John Snow
2016-11-08  2:05       ` Jeff Cody [this message]
2016-11-08  2:20         ` John Snow
2016-11-08  9:16         ` Kevin Wolf
2016-11-02 17:50 ` [Qemu-devel] [PATCH v3 5/6] blockjob: refactor backup_start as backup_job_create John Snow
2016-11-03 13:17   ` Kevin Wolf
2016-11-08  5:41     ` John Snow
2016-11-08  9:11       ` Kevin Wolf
2016-11-08 15:24         ` John Snow
2016-11-08 18:30           ` Jeff Cody
2016-11-08  3:14   ` Jeff Cody
2016-11-02 17:50 ` [Qemu-devel] [PATCH v3 6/6] iotests: add transactional failure race test John Snow
2016-11-03 13:21 ` [Qemu-devel] [PATCH v3 0/6] jobs: fix transactional race condition Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161108020528.GC8956@localhost.localdomain \
    --to=jcody@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).