From: John Snow
To: Jeff Cody
Cc: Kevin Wolf, vsementsov@virtuozzo.com, qemu-block@nongnu.org,
 qemu-devel@nongnu.org, stefanha@redhat.com, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [PATCH v3 4/6] blockjob: add block_job_start
Date: Mon, 7 Nov 2016 21:20:32 -0500
Message-ID: <33ccab69-7bda-a082-7758-79ff40290804@redhat.com>
In-Reply-To: <20161108020528.GC8956@localhost.localdomain>
References: <1478109056-25198-1-git-send-email-jsnow@redhat.com>
 <1478109056-25198-5-git-send-email-jsnow@redhat.com>
 <20161103121743.GB5352@noname.redhat.com>
 <20161108020528.GC8956@localhost.localdomain>

On 11/07/2016 09:05 PM, Jeff Cody wrote:
> On Mon, Nov 07, 2016 at 09:02:14PM -0500, John Snow wrote:
>>
>>
>> On 11/03/2016 08:17 AM, Kevin Wolf wrote:
>>> On 02.11.2016 at 18:50, John Snow wrote:
>>>> Instead of automatically starting jobs at creation time via backup_start
>>>> et al, we'd like to return a job object pointer that can be started
>>>> manually at a later point in time.
>>>>
>>>> For now, add the block_job_start mechanism and start the jobs
>>>> automatically as we have been doing, with conversions job-by-job coming
>>>> in later patches.
>>>>
>>>> Of note: cancellation of unstarted jobs will perform all the normal
>>>> cleanup as if the job had started, particularly abort and clean. The
>>>> only difference is that we will not emit any events, because the job
>>>> never actually started.
>>>>
>>>> Signed-off-by: John Snow
>>>
>>>> diff --git a/block/commit.c b/block/commit.c
>>>> index 20d27e2..5b7c454 100644
>>>> --- a/block/commit.c
>>>> +++ b/block/commit.c
>>>> @@ -289,10 +289,9 @@ void commit_start(const char *job_id, BlockDriverState *bs,
>>>>      s->backing_file_str = g_strdup(backing_file_str);
>>>>
>>>>      s->on_error = on_error;
>>>> -    s->common.co = qemu_coroutine_create(s->common.driver->start, s);
>>>>
>>>>      trace_commit_start(bs, base, top, s, s->common.co);
>>>
>>> s->common.co is now uninitialised and should probably be removed from
>>> the tracepoint arguments. The same is true for mirror and stream.
>>>
>>>> -    qemu_coroutine_enter(s->common.co);
>>>> +    block_job_start(&s->common);
>>>>  }
>>>
>>>> diff --git a/blockjob.c b/blockjob.c
>>>> index e3c458c..16c5159 100644
>>>> --- a/blockjob.c
>>>> +++ b/blockjob.c
>>>> @@ -174,7 +174,8 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
>>>>      job->blk = blk;
>>>>      job->cb = cb;
>>>>      job->opaque = opaque;
>>>> -    job->busy = true;
>>>> +    job->busy = false;
>>>> +    job->paused = true;
>>>>      job->refcnt = 1;
>>>>      bs->job = job;
>>>>
>>>> @@ -202,6 +203,21 @@ bool block_job_is_internal(BlockJob *job)
>>>>      return (job->id == NULL);
>>>>  }
>>>>
>>>> +static bool block_job_started(BlockJob *job)
>>>> +{
>>>> +    return job->co;
>>>> +}
>>>> +
>>>> +void block_job_start(BlockJob *job)
>>>> +{
>>>> +    assert(job && !block_job_started(job) && job->paused &&
>>>> +           !job->busy && job->driver->start);
>>>> +    job->paused = false;
>>>> +    job->busy = true;
>>>> +    job->co = qemu_coroutine_create(job->driver->start, job);
>>>> +    qemu_coroutine_enter(job->co);
>>>> +}
>>>
>>> We allow the user to pause a job while it's not started yet. You
>>> classified this as "harmless". But if we accept this, can we really
>>> unconditionally enter the coroutine even if the job has been paused?
>>> Can't a user expect that a job remains in the paused state when they
>>> explicitly requested a pause and the job was already internally paused,
>>> like in this case by block_job_create()?
>>>
>>
>> What will end up happening is that we'll enter the job, and then it'll
>> pause immediately upon entrance. Is that a problem?
>>
>> If the jobs themselves are not checking their pause state fastidiously,
>> it could be (but block/backup does -- after it creates a write notifier.)
>>
>> Do we want a stronger guarantee here?
>>
>> Naively I think it's OK as-is, but I could add a stronger boolean that
>> lets us know if it's okay to start or not, and we could delay the actual
>> creation and start until the 'resume' comes in if you'd like.
>>
>> I'd like to avoid the complexity if we can help it, but perhaps I'm not
>> thinking carefully enough about the existing edge cases.
>>
>
> Is there any reason we can't just use job->pause_count here? When the job
> is created, set job->paused = true, and job->pause_count = 1. In
> block_job_start(), check the pause_count prior to qemu_coroutine_enter():
>
> void block_job_start(BlockJob *job)
> {
>     assert(job && !block_job_started(job) && job->paused &&
>            !job->busy && job->driver->start);
>     job->co = qemu_coroutine_create(job->driver->start, job);
>     job->paused = --job->pause_count > 0;
>     if (!job->paused) {
>         job->busy = true;
>         qemu_coroutine_enter(job->co);
>     }
> }
>

Solid point. Let's do it this way. Thanks!

>
>>> The same probably also applies to the internal job pausing during
>>> bdrv_drain_all_begin/end, though as you know there is a larger problem
>>> with starting jobs under drain_all anyway. For now, we just need to keep
>>> in mind that we can neither create nor start a job in such sections.
>>>
>>
>> Yeah, there are deeper problems there. As long as the existing critical
>> sections don't allow us to create jobs (started or not), I think we're
>> probably already OK.
>>
>>> Kevin
>>>
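
For reference, below is a minimal standalone sketch of the pause_count
approach agreed on above. It is not QEMU code: the coroutine is modelled
with a plain function pointer, the BlockJob struct is cut down to the fields
mentioned in this thread, and block_job_user_pause()/block_job_user_resume()
are simplified stand-ins for the real helpers of the same name. It only
illustrates that a user pause issued before block_job_start() defers the
actual start until the matching resume.

/*
 * Standalone model of the create/start split: block_job_create() leaves
 * the job paused with pause_count == 1, and block_job_start() only enters
 * the job body once every pause request, including the implicit one from
 * creation, has been released.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct BlockJob BlockJob;
typedef void (*JobStartFn)(BlockJob *job);

struct BlockJob {
    JobStartFn start;   /* stand-in for job->driver->start          */
    bool started;       /* stand-in for "job->co has been created"  */
    bool busy;
    bool paused;
    int pause_count;
};

static int runs;        /* counts how often the job body actually ran */

static BlockJob *block_job_create(JobStartFn start)
{
    BlockJob *job = calloc(1, sizeof(*job));
    job->start = start;
    job->busy = false;
    job->paused = true;      /* created paused ...                   */
    job->pause_count = 1;    /* ... with one implicit pause request  */
    return job;
}

static void block_job_start(BlockJob *job)
{
    assert(job && !job->started && job->paused && !job->busy && job->start);
    job->started = true;     /* qemu_coroutine_create() in the real code */
    job->paused = --job->pause_count > 0;
    if (!job->paused) {
        job->busy = true;
        job->start(job);     /* qemu_coroutine_enter() in the real code */
    }
}

static void block_job_user_pause(BlockJob *job)
{
    job->pause_count++;
    job->paused = true;
}

static void block_job_user_resume(BlockJob *job)
{
    assert(job->pause_count > 0);
    job->paused = --job->pause_count > 0;
    if (!job->paused && job->started && !job->busy) {
        job->busy = true;    /* the deferred start happens here */
        job->start(job);
    }
}

static void fake_backup_run(BlockJob *job)
{
    (void)job;
    runs++;
    printf("job body ran\n");
}

int main(void)
{
    BlockJob *job = block_job_create(fake_backup_run);

    block_job_user_pause(job);   /* user pause before the job is started */
    block_job_start(job);        /* pause_count drops 2 -> 1: no entry   */
    assert(job->started && job->paused && runs == 0);

    block_job_user_resume(job);  /* pause_count drops 1 -> 0: job runs   */
    assert(!job->paused && job->busy && runs == 1);

    free(job);
    return 0;
}

main() walks through exactly the scenario Kevin raised: create, user pause,
start (deferred because a pause is pending), then resume (the job runs).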