From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mike Snitzer Subject: Re: dm snapshot: stop merging using a completion Date: Sun, 6 Dec 2009 01:43:39 -0500 Message-ID: <20091206064338.GA15389@redhat.com> References: <1260037484-20905-1-git-send-email-snitzer@redhat.com> Reply-To: device-mapper development Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Content-Disposition: inline In-Reply-To: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com To: Mikulas Patocka Cc: dm-devel@redhat.com, Alasdair G Kergon List-Id: dm-devel.ids On Sat, Dec 05 2009 at 9:01pm -0500, Mikulas Patocka wrote: > On Sat, 5 Dec 2009, Mike Snitzer wrote: > > > Switch stop_merge() from using a busy loop to a completion event. > > > > stop_merge() now requests merging be shutdown using the > > 'merge_completion' pointer (instead of the 'merge_shutdown' flag). This > > is accomplished by testing if 'merge_completion' is not NULL in > > snapshot_merge_process(). stop_merge() allocates its completion on the > > stack and assigns it to the 'merge_completion' pointer in the snapshot. > > 'merge_completion' is protected by the snapshot's lock. > > > > Also changed the 'merge_running' flag from int to atomic_t. > > No, there's a bug: > > > static void stop_merge(struct dm_snapshot *s) > > { > > - while (s->merge_running) { > > - s->merge_shutdown = 1; > > - msleep(1); > > + DECLARE_COMPLETION_ONSTACK(merge_stopped); > > + if (atomic_read(&s->merge_running)) { > > --- if the merge stops exactly at this point (because it gets finished or > because of an i/o error), we are waiting for a completion that will be > never signalled. Yes, valid point. But for this rare corner case we could just use wait_for_completion_timeout() with a fairly large timeout; like 30 sec? That actually isn't a great option (racey)... How about if the 'shut:' code paths also checked for s->merge_completion and complete() if it is not NULL? Which means that check and related complete() code would become a function. > > + down_write(&s->lock); > > + s->merge_completion = &merge_stopped; > > + up_write(&s->lock); > > + wait_for_completion(&merge_stopped); > > } > > - s->merge_shutdown = 0; > > } > > > > /* > > For Alasdair: do you get the problem? If I write it with msleep() > correctly, you keep on complaining how unclean it is --- if it is written > with completions and it is wrong (because they are just harder to use > correctly than simple variables and msleep), you tend to support it. Now > you see in practice how complex constructs tend to trigger bugs. > > Mike: I thought that the completion would be in struct dm_snapshot. But > maybe, try it with wait_on_bit / wake_up_bit / test_bit / set_bit etc., it > may be easier than completions. I can look at it; but I think using a completion can work. Mike