From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S932670AbdIFOvs (ORCPT <rfc822;w@1wt.eu>);
        Wed, 6 Sep 2017 10:51:48 -0400
Received: from mail-qk0-f196.google.com ([209.85.220.196]:36170 "EHLO
        mail-qk0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S932562AbdIFOvn (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Wed, 6 Sep 2017 10:51:43 -0400
X-Google-Smtp-Source: AOwi7QAWd7Bv8BsHXpAihtZJcU5X9xJbFDbMKbPEGc8MQdwcQxr1LIQofUMUvQE2FiRGKVsd1RFWkA==
Date: Wed, 6 Sep 2017 07:51:39 -0700
From: Tejun Heo <tj@kernel.org>
To: David Howells <dhowells@redhat.com>
Cc: linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org,
        Lai Jiangshan <jiangshanlai@gmail.com>, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 01/11] workqueue: Add a decrement-after-return and
 wake if 0 facility
Message-ID: <20170906145139.GO1774378@devbig577.frc2.facebook.com>
References: <20170905132951.GB1774378@devbig577.frc2.facebook.com>
 <150428045304.25051.1778333106306853298.stgit@warthog.procyon.org.uk>
 <27489.1504623016@warthog.procyon.org.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <27489.1504623016@warthog.procyon.org.uk>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello, David.

On Tue, Sep 05, 2017 at 03:50:16PM +0100, David Howells wrote:
> With one of my latest patches to AFS, there's a set of cell records, where
> each cell has a manager work item that maintains that cell, including
> refreshing DNS records and excising expired records from the list.  Performing
> the excision in the manager work item makes handling the fscache index cookie
> easier (you can't have two cookies attached to the same object), amongst other
> things.
> 
> There's also an overseer work item that maintains a single expiry timer for
> all the cells and queues the per-cell work items to do DNS updates and cell
> removal.
> 
> The reason that the overseer exists is that it makes it easier to do a put on
> a cell.  The cell decrements the cell refcount and then wants to schedule the
> cell for destruction - but it's no longer permitted to touch the cell.  I
> could use atomic_dec_and_lock(), but that's messy.  It's cleaner just to set
> the timer on the overseer and leave it to that.
> 
> However, if someone does rmmod, I have to be able to clean everything up.  The
> overseer timer may be queued or running; the overseer may be queued *and*
> running and may get queued again by the timer; and each cell's work item may
> be queued *and* running and may get queued again by the manager.

Thanks for the detailed explanation.

> > Why can't it be done via the usual "flush from exit"?
> 
> Well, it can, but you need a flush for each separate level of dependencies,
> where one dependency will kick off another level of dependency during the
> cleanup.
> 
> So what I think I would have to do is set a flag to say that no one is allowed
> to set the timer now (this shouldn't happen outside of server or volume cache
> clearance), delete the timer synchronously, flush the work queue four times
> and then do an RCU barrier.
> 
> However, since I have volumes with dependencies on servers and cells, possibly
> with their own managers, I think I may need up to 12 flushes, possibly with
> interspersed RCU barriers.

Would it be possible to isolate work items for the cell in its own
workqueue and use drain_workqueue()?  Separating out flush domains is
one of the main use cases for dedicated workqueues after all.
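
To illustrate the difference being suggested, here is a minimal userspace
sketch, not kernel code.  It models a flush as "run only what was pending
when the call started" and a drain as "keep flushing until nothing is
pending", which is roughly how flush_workqueue() and drain_workqueue()
differ when work items queue further work on the same workqueue.  All of
the names below (toy_wq, toy_queue, toy_flush, toy_drain, chained_work)
are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 64

struct toy_wq;
typedef void (*work_fn)(struct toy_wq *wq, int depth);

/* Hypothetical stand-in for a work item: a function plus one argument. */
struct toy_work {
	work_fn fn;
	int depth;
};

/* Hypothetical stand-in for a dedicated workqueue (single-threaded). */
struct toy_wq {
	struct toy_work pending[MAX_PENDING];
	size_t head;		/* next item to run */
	size_t tail;		/* next free slot */
	int executed;		/* total items run so far */
};

static void toy_queue(struct toy_wq *wq, work_fn fn, int depth)
{
	assert(wq->tail < MAX_PENDING);
	wq->pending[wq->tail].fn = fn;
	wq->pending[wq->tail].depth = depth;
	wq->tail++;
}

/* Flush model: run only the items that were pending at call time. */
static size_t toy_flush(struct toy_wq *wq)
{
	size_t stop = wq->tail;
	size_t ran = 0;

	while (wq->head < stop) {
		struct toy_work w = wq->pending[wq->head++];

		w.fn(wq, w.depth);	/* may queue more work */
		wq->executed++;
		ran++;
	}
	return ran;
}

/* Drain model: repeat the flush until the queue is empty. */
static int toy_drain(struct toy_wq *wq)
{
	int passes = 0;

	while (wq->head != wq->tail) {
		toy_flush(wq);
		passes++;
	}
	return passes;
}

/* A work item that kicks off one more level of dependent work. */
static void chained_work(struct toy_wq *wq, int depth)
{
	if (depth > 0)
		toy_queue(wq, chained_work, depth - 1);
}
```

With a chain three levels deep, a single toy_flush() leaves dependent work
pending, while a single toy_drain() call keeps going until the chain has
run out.  The caller no longer has to know how many levels of dependency
exist.  Note that in the kernel only work items already on the workqueue
may queue further work during a drain, so anything that queues from outside
(a timer, for instance) would still have to be stopped first.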

> It's much simpler to count out the objects than to try and get the flushing
> right.

I still feel very reluctant to add a generic counting & trigger
mechanism to work items for this.  I think it's too generic a solution
for a very specific problem.

Thanks.

-- 
tejun