From mboxrd@z Thu Jan  1 00:00:00 1970
From: Christoph Hellwig <hch@infradead.org>
Subject: Re: [RESEND PATCH] writeback: Judge bdi->dev when set worker desc in
 bdi_writeback_workfn.
Date: Thu, 26 Sep 2013 10:47:02 -0700
Message-ID: <20130926174702.GA12274@infradead.org>
References: <201309100811409182210@gmail.com>
 <2013092516084285960311@gmail.com>
 <20130925224351.GJ26872@dastard>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: majianpeng <majianpeng@gmail.com>, tj <tj@kernel.org>,
	axboe <axboe@kernel.dk>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
To: Dave Chinner <david@fromorbit.com>
Return-path: <linux-fsdevel-owner@vger.kernel.org>
Received: from bombadil.infradead.org ([198.137.202.9]:49980 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753498Ab3IZRrK (ORCPT
	<rfc822;linux-fsdevel@vger.kernel.org>);
	Thu, 26 Sep 2013 13:47:10 -0400
Content-Disposition: inline
In-Reply-To: <20130925224351.GJ26872@dastard>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID: <linux-fsdevel.vger.kernel.org>

On Thu, Sep 26, 2013 at 08:43:51AM +1000, Dave Chinner wrote:
> You're just papering over the larger problem, in that the writeback
> work is running concurrently with the bdi_unregister() function that
> is tearing the bdi down. You should try to fix the underlying race
> condition, as documented in bdi_destroy.

The problem is even worse than that: the bdi lacks proper refcounting,
as seen in gems like bdi_prune_sb and the moving of writeback requests
in bdi_destroy.  The right fix is to dynamically
allocate the bdi (or at least the bdi_writeback), and make sure that we
keep it around as long as a filesystem, and thus the writeback code,
refers to it.  Then a block device going away can just set a flag to stop
writeback from trying, instead of having the bdi ripped out from underneath
it, which is guaranteed to fail and historically has failed in a large
number of mysterious ways.
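
As a rough illustration of the scheme described above, here is a minimal
userspace sketch, not kernel code.  Every name in it (struct bdi_sketch,
bdi_sketch_alloc, bdi_sketch_get, bdi_sketch_put, bdi_sketch_mark_dead,
bdi_sketch_writeback) is hypothetical and made up for this example; none
of them is an existing kernel interface.  It assumes C11 atomics and only
shows the lifetime rule: the object is heap-allocated, each user holds a
reference, and device removal sets a flag rather than freeing anything.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a dynamically allocated bdi. */
struct bdi_sketch {
	atomic_int refcount;	/* one reference per user (device, filesystem) */
	atomic_bool dead;	/* set once the block device has gone away */
};

/* Allocate with a single reference, owned by the caller (the device). */
static struct bdi_sketch *bdi_sketch_alloc(void)
{
	struct bdi_sketch *bdi = malloc(sizeof(*bdi));

	if (!bdi)
		return NULL;
	atomic_init(&bdi->refcount, 1);
	atomic_init(&bdi->dead, false);
	return bdi;
}

/* Take an extra reference, e.g. when a filesystem starts using the bdi. */
static void bdi_sketch_get(struct bdi_sketch *bdi)
{
	atomic_fetch_add(&bdi->refcount, 1);
}

/* Drop a reference.  Returns true if it was the last one and bdi was freed. */
static bool bdi_sketch_put(struct bdi_sketch *bdi)
{
	int old = atomic_fetch_sub(&bdi->refcount, 1);

	assert(old > 0);
	if (old == 1) {
		free(bdi);
		return true;
	}
	return false;
}

/* Device removal only flags the bdi; the memory stays valid for other users. */
static void bdi_sketch_mark_dead(struct bdi_sketch *bdi)
{
	atomic_store(&bdi->dead, true);
}

/* Writeback checks the flag first.  Returns false if it declined to run. */
static bool bdi_sketch_writeback(struct bdi_sketch *bdi)
{
	if (atomic_load(&bdi->dead))
		return false;
	/* Real writeback work would be issued here. */
	return true;
}
```

With this shape the device side calls bdi_sketch_mark_dead() followed by
bdi_sketch_put() on removal, while the filesystem keeps its own reference
until unmount, so a late writeback attempt sees the flag instead of freed
memory.  A real implementation would still have to deal with work that is
already in flight when the flag is set, which this sketch does not cover.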