From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.9 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 89A4AC4363A for ; Mon, 26 Oct 2020 22:40:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 33CBF20882 for ; Mon, 26 Oct 2020 22:40:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="rgve2l5u" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2394639AbgJZWk4 (ORCPT ); Mon, 26 Oct 2020 18:40:56 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:47166 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2394638AbgJZWk4 (ORCPT ); Mon, 26 Oct 2020 18:40:56 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 09QMcUqh087964; Mon, 26 Oct 2020 22:40:54 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=date : from : to : cc : subject : message-id : references : mime-version : content-type : in-reply-to; s=corp-2020-01-29; bh=NVbpzASoIzykzgwS7XwDPPV19MK6NS0ZMi+GT2ZfpOA=; b=rgve2l5ujflzRq8LC2aqxLqUS0/hQQfkj+98H8GPbaUIwPn3X+4SKyLKEv80/QgR6J50 xMWrTcBoEDDHPJH0RD+XT0qvu1M3c6BwtcGBptt8xt/XArwTol/whspjeVlXO3kEatVK 6g75d5j0scebRfnfxRDEfyvn19yfsZtvTOFbwHg8mDoi8qFCjudwuBer9biIZoZEFvyC prtaD+oZ0B/18LbjLZZNwunwyGjR1VyZW37Ca/0SSmVH2xejhnCL4kqNFbBrsqZMbMdK YqBUAolmWqDfGyWr9f2w6orQTUD94zhBRWwPt9YZKv3vg1OQHrj+yS0G79vEjDsAJms8 5Q== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by userp2120.oracle.com with ESMTP id 34dgm3vq98-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Mon, 26 Oct 2020 22:40:53 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 09QMVHFO188309; Mon, 26 Oct 2020 22:40:53 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by userp3030.oracle.com with ESMTP id 34cx6v92kp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Oct 2020 22:40:53 +0000 Received: from abhmp0013.oracle.com (abhmp0013.oracle.com [141.146.116.19]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id 09QMeqN8010819; Mon, 26 Oct 2020 22:40:52 GMT Received: from localhost (/10.159.145.170) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 26 Oct 2020 15:40:52 -0700 Date: Mon, 26 Oct 2020 15:40:51 -0700 From: "Darrick J. Wong" To: Dave Chinner Cc: linux-xfs@vger.kernel.org Subject: Re: [PATCH 1/7] workqueue: bound maximum queue depth Message-ID: <20201026224051.GC347246@magnolia> References: <20201022051537.2286402-1-david@fromorbit.com> <20201022051537.2286402-2-david@fromorbit.com> <20201025044114.GA347246@magnolia> <20201026222943.GV7391@dread.disaster.area> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20201026222943.GV7391@dread.disaster.area> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9786 signatures=668682 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 phishscore=0 spamscore=0 bulkscore=0 malwarescore=0 mlxlogscore=999 mlxscore=0 suspectscore=1 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2009150000 definitions=main-2010260146 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9786 signatures=668682 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 lowpriorityscore=0 impostorscore=0 adultscore=0 bulkscore=0 spamscore=0 phishscore=0 mlxlogscore=999 suspectscore=1 clxscore=1015 mlxscore=0 malwarescore=0 priorityscore=1501 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2009150000 definitions=main-2010260147 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org On Tue, Oct 27, 2020 at 09:29:43AM +1100, Dave Chinner wrote: > On Sat, Oct 24, 2020 at 09:41:14PM -0700, Darrick J. Wong wrote: > > On Thu, Oct 22, 2020 at 04:15:31PM +1100, Dave Chinner wrote: > > > @@ -140,6 +164,7 @@ workqueue_add( > > > > > > /* Now queue the new work structure to the work queue. */ > > > pthread_mutex_lock(&wq->lock); > > > +restart: > > > if (wq->next_item == NULL) { > > > assert(wq->item_count == 0); > > > ret = -pthread_cond_signal(&wq->wakeup); > > > @@ -150,6 +175,16 @@ workqueue_add( > > > } > > > wq->next_item = wi; > > > } else { > > > + /* throttle on a full queue if configured */ > > > + if (wq->max_queued && wq->item_count == wq->max_queued) { > > > + pthread_cond_wait(&wq->queue_full, &wq->lock); > > > > I ported xfs_scrub to use max_queued for the inode scanner, and got a > > hang here. It uses two workqueues -- the first is an unbouned workqueue > > that receives one work item per AG in which each work item calls > > INUMBERS, creates a work item for the returned inode chunk, and throws > > it at the second workqueue. The second workqueue is a bounded workqueue > > that calls BULKSTAT on the INUMBERS work item and then calls the > > iteration function on each bulkstat record returned. > > > > The hang happens when the inumbers workqueue has more than one thread > > running. > > IIUC, that means you have multiple producer threads? IIRC, he usage > in this patchset is single producer, so it won't hit this problem... Right, there are multiple producers, because that seemed like fun. At this stage, I'm merely using this patch in anger (as it were) to prototype a fix to scrub. I'm not even sure it's a reasonable enhancement for xfs_scrub since each of those bulkstat workers will then fight with the inumbers workers for the AGI, but so it goes. It does seem to eliminate the problem of one thread working hard on an AG that has one huge fragmented file while the other threads sit idle, but otherwise I mostly just see more buffer lock contention. The total runtime doesn't change on more balanced filesystems, at least. :P > > Both* threads notice the full workqueue and wait on > > queue_full. One of the workers in the second workqueue goes to pull off > > the next work item, ends up in this if body, signals one of the sleeping > > threads, and starts calling bulkstat. > > > > In the time it takes to wake up the sleeping thread from wq 1, the > > second workqueue pulls far enough ahead that the single thread from wq1 > > never manages to fill wq2 again. Often, the wq1 thread was sleeping so > > that it could add the last inode chunk of that AG to wq2. We therefore > > never wake up the *other* sleeping thread from wq1, and the whole app > > stalls. > > > > I dunno if that's a sane way to structure an inumbers/bulkstat scan, but > > it seemed reasonable to me. I can envision two possible fixes here: (1) > > use pthread_cond_broadcast to wake everything up; or (2) always call > > pthread_cond_wait when we pull a work item off the queue. Thoughts? > > pthread_cond_broadcast() makes more sense, but I suspect there will > be other issues with multiple producers that render the throttling > ineffective. I suspect supporting multiple producers should be a > separate patchset... Making change (2) seems to work for multiple producers, but I guess I'll keep poking at perf to see if I discover anything exciting. --D > > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com