Received: from mx1.redhat.com ([209.132.183.28]:54962 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1727784AbeKJB1S (ORCPT ); Fri, 9 Nov 2018 20:27:18 -0500
Date: Fri, 9 Nov 2018 10:46:09 -0500
From: Brian Foster
Subject: Re: [PATCH] xfs: defer online discard submission to a workqueue
Message-ID: <20181109154608.GB5572@bfoster>
References: <20181105181021.8174-1-bfoster@redhat.com>
	<20181105215139.GA3160@infradead.org>
	<20181106142310.GA2773@bfoster>
	<20181109150610.GB9153@infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20181109150610.GB9153@infradead.org>
Sender: linux-xfs-owner@vger.kernel.org
List-Id: xfs
To: Christoph Hellwig
Cc: linux-xfs@vger.kernel.org

On Fri, Nov 09, 2018 at 07:06:10AM -0800, Christoph Hellwig wrote:
> On Tue, Nov 06, 2018 at 09:23:11AM -0500, Brian Foster wrote:
> > My
> > understanding is that these discards can stack up and take enough time
> > that a limit on outstanding discards is required, which now that I think
> > of it makes me somewhat skeptical of the whole serial execution thing.
> > Hitting that outstanding discard request limit is what bubbles up the
> > stack and affects XFS by holding up log forces, since new discard
> > submissions are presumably blocked on completion of the oldest
> > outstanding request.
>
> We don't do strict ordering of requests, but eventually requests
> waiting for completion will block others from being submitted.
>

Ok, that's kind of what I expected.

> > I'm not quite sure what happens in the block layer if that limit were
> > lifted. Perhaps it assumes throttling responsibility directly via
> > queues/plugs? I'd guess that at minimum we'd end up blocking indirectly
> > somewhere (via memory allocation pressure?) anyways, so ISTM that some
> > kind of throttling is inevitable in this situation. What am I missing?
>
> We'll still block new allocations waiting for these blocks and
> other bits.  Or to put it another way - if your discard implementation
> is slow (independent of synchronous or not) you are going to be in
> a world of pain with online discard.  That is why it's not the default
> to start with.

Sure, it's not really the XFS bits I was asking about here. This is
certainly not a high priority and not a common use case. We're working
through some of the other issues in the other sub-thread. In particular,
I'm wondering if we can provide broader improvements to the overall
mechanism to reduce some of that pain.

Brian
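
For illustration only, here is a toy model of the throttling behaviour
discussed above: a submitter that stalls once a fixed number of requests
are outstanding. It is not the block layer or the XFS code. The function
name, the parameters and the assumption that every discard takes the same
fixed time are all made up for the sketch.

```python
import heapq


def simulate_submissions(num_requests, max_in_flight, service_time):
    """Return the time at which each request is accepted for submission.

    Hypothetical toy model: every request takes `service_time` units,
    at most `max_in_flight` requests may be outstanding at once, and the
    submitter issues requests back to back, stalling when the limit is hit.
    """
    completions = []   # min-heap of completion times of outstanding requests
    now = 0.0          # the submitter's clock
    accepted_at = []
    for _ in range(num_requests):
        if len(completions) == max_in_flight:
            # Limit reached: wait for the earliest outstanding completion.
            now = max(now, heapq.heappop(completions))
        accepted_at.append(now)
        heapq.heappush(completions, now + service_time)
    return accepted_at


if __name__ == "__main__":
    # Slow discards with a small limit: later submissions are delayed.
    print(simulate_submissions(5, 2, 10.0))
    # Limit never reached: nothing blocks in the submitter.
    print(simulate_submissions(5, 8, 10.0))
```

With a limit of 2 and a service time of 10, the third and fourth
submissions are held until time 10 and the fifth until time 20. That is
the kind of stall that would propagate to whoever is waiting on the
submitter. Raising the limit only moves the waiting elsewhere if the
device itself is slow.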