From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 23 Mar 2023 09:12:37 +0100
From: Christoph Hellwig
To: Qu Wenruo
Cc: Christoph Hellwig, Chris Mason, Josef Bacik, David Sterba,
    Johannes Thumshirn, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 03/10] btrfs: offload all write I/O completions to a workqueue
Message-ID: <20230323081237.GA21669@lst.de>
References: <20230314165910.373347-1-hch@lst.de>
 <20230314165910.373347-4-hch@lst.de>
 <2aa047a7-984e-8f6f-163e-8fe6d12a41d8@gmx.com>
 <20230320123059.GB9008@lst.de>
 <20230321125550.GB10470@lst.de>
 <5eebb0fc-0be3-c313-27cd-4e11a7b04405@gmx.com>
 <20230322083258.GA23315@lst.de>
List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

On Thu, Mar 23, 2023 at 04:07:28PM +0800, Qu Wenruo wrote:
>>> And if a work load can only be deadlock free using the default
>>> max_active, but not any value smaller, then I'd say the work load
>>> itself is buggy.
>>
>> Anything that has an interaction between two instances of a work_struct
>> can deadlock.  Only a single execution context is guaranteed (and even
>> that only with WQ_MEM_RECLAIM), and we've seen plenty of deadlocks due
>> to e.g. only using a global workqueue in file systems or block devices
>> that can stack.
>
> Shouldn't we avoid such cross workqueue workload at all cost?

Yes, btrfs uses per-sb workqueues.  As do most other places now, but
there was a bit of a learning curve years ago.

>> So this is the first time I see an actual explanation, thanks for that
>> first.  If this is the reason we should apply the max_active to all
>> workqueues that do csum and compression work, but not to other random
>> workqueues.
>
> If we're limiting the max_active for certain workqueues, then I'd say
> why not to all workqueues?
>
> If we have some usage relying on the amount of workers, at least we
> should be able to expose it and fix it.

Again, btrfs is the odd one out in allowing the user to set arbitrary
limits.  This code predates using the kernel workqueues, and I'm a
little doubtful it still is useful.  But for that I need to figure out
why it was even kept when converting btrfs to use workqueues.

> (IIRC we should have a better way with less cross-workqueue dependency)

I've been very actively working on reducing the amount of different
workqueues.  This series is an important part of that.
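
[Editorial note: the deadlock pattern described above ("only a single
execution context is guaranteed") can be sketched as a minimal,
illustrative kernel-module fragment.  This is not code from the patch
series; all names here are hypothetical, and the snippet is a sketch of
the failure mode, not something that belongs in a real driver.]

/*
 * Illustrative sketch: work A blocks on work B queued to the same
 * workqueue.  With max_active == 1 -- or under memory pressure, where
 * only the single WQ_MEM_RECLAIM rescuer context is guaranteed -- B can
 * never start because A occupies the only execution context, so A's
 * wait_for_completion() never returns.  A larger max_active merely
 * hides the bug; it does not fix it.
 */
#include <linux/module.h>
#include <linux/workqueue.h>
#include <linux/completion.h>

static struct workqueue_struct *example_wq;
static struct work_struct work_a, work_b;
static DECLARE_COMPLETION(b_done);

static void work_b_fn(struct work_struct *w)
{
	complete(&b_done);
}

static void work_a_fn(struct work_struct *w)
{
	queue_work(example_wq, &work_b);
	/* Deadlocks if example_wq has a single execution context. */
	wait_for_completion(&b_done);
}

static int __init example_init(void)
{
	/* max_active = 1 makes the hang deterministic. */
	example_wq = alloc_workqueue("example", WQ_MEM_RECLAIM, 1);
	if (!example_wq)
		return -ENOMEM;
	INIT_WORK(&work_a, work_a_fn);
	INIT_WORK(&work_b, work_b_fn);
	queue_work(example_wq, &work_a);
	return 0;
}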