From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752777AbZHZSmh (ORCPT ); Wed, 26 Aug 2009 14:42:37 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1752306AbZHZSmg (ORCPT ); Wed, 26 Aug 2009 14:42:36 -0400 Received: from sj-iport-6.cisco.com ([171.71.176.117]:1850 "EHLO sj-iport-6.cisco.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750802AbZHZSmf (ORCPT ); Wed, 26 Aug 2009 14:42:35 -0400 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApoEAF8elUqrR7MV/2dsb2JhbADAQog5kDYFhBo X-IronPort-AV: E=Sophos;i="4.44,281,1249257600"; d="scan'208";a="375861321" From: Roland Dreier To: Oleg Nesterov Cc: linux-kernel@vger.kernel.org, Dmitry Torokhov Subject: Re: Is adding requeue_delayed_work() a good idea References: <20090821115547.GA6901@redhat.com> <20090824180112.GC16202@redhat.com> <20090825093906.GA3020@redhat.com> X-Message-Flag: Warning: May contain useful information Date: Wed, 26 Aug 2009 11:42:36 -0700 In-Reply-To: <20090825093906.GA3020@redhat.com> (Oleg Nesterov's message of "Tue, 25 Aug 2009 11:39:06 +0200") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.91 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-OriginalArrivalTime: 26 Aug 2009 18:42:37.0395 (UTC) FILETIME=[FBB0D230:01CA267C] Authentication-Results: sj-dkim-1; header.From=rdreier@cisco.com; dkim=pass ( sig from cisco.com/sjdkim1004 verified; ); Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org > OK, in this case I think we have a simple solution, > > // like cancel_delayed_work, but uses del_timer(). > // this means, if it returns 0 the timer function may be > // running and the queueing is in progress. The caller > // can't rely on flush_workqueue/etc > static inline int __cancel_delayed_work(struct delayed_work *work) > { > int ret; > > ret = del_timer(&work->timer); > if (ret) > work_clear_pending(&work->work); > return ret; > } > > Now, you can do > > spin_lock_irqsave(&mydata_lock); > new_timeout = add_item_to_timeout_list(); > > __cancel_delayed_work(&process_timeout_list_work); > queue_delayed_work(wq, &process_timeout_list_work, new_timeout); > > spin_unlock_irqsave(&mydata_lock); > > If queue_delayed_work() fails, this means that WORK_STRUCT_PENDING is set, > dwork->work is already queued or the queueing is in progress. In both > cases it will run "soon" as if we just called queue_work(&dwork->work). This looks like it would work well. If we can get this into 2.6.32 then I will drop my patch and switch to this approach instead. > But this assumes nobody else does queue_delayed_work(dwork, HUGE_DELAY) in > parallel, otherwise we can lose the race and another caller can setup > HUGE_DELAY timeout. In my case this is fine -- all uses of queue_delayed_work() are synchronized with the same lock. So any place that tries to shorten the timeout will succeed. > In particular, if process_timeout_list_work->func() itself uses > queue_delay_work() to re-arm itself we can race. Bu t I think it is always > possible to do something to synchronize with work->func, for example > work->func() can re-arm itself _before_ it scans timeout_list (under the > same lock). This way, if re-queue code above fails because work->func() > wins, work->func() must see the new additions to timeout_list. In my case, work function does do queue_delayed_work(), but with the same lock as everyone else held. So there should be no race. Thanks, Roland