Date: Sat, 24 Feb 2024 04:12:31 +0000
From: Matthew Wilcox
To: Luis Chamberlain
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-mm, Daniel Gomez, Pankaj Raghav, Jens Axboe, Dave Chinner,
	Christoph Hellwig, Chris Mason, Johannes Weiner, Linus Torvalds
Subject: Re: [LSF/MM/BPF TOPIC] Measuring limits and enhancing buffered IO

On Fri, Feb 23, 2024 at 03:59:58PM -0800, Luis Chamberlain wrote:
> Part of the testing we have done with LBS was to do some performance
> tests on XFS to ensure things are not regressing. Building linux is a
> fine decent test and we did some random cloud instance tests on that and
> presented that at Plumbers, but it doesn't really cut it if we want to
> push things to the limit though.
> What are the limits to buffered IO
> and how do we test that? Who keeps track of it?
>
> TLDR: Why does the pagecache suck?
>
> ~86 GiB/s on pmem DIO on xfs with 64k block size, 1024 XFS agcount on x86_64
>
> Vs
>
> ~ 7,000 MiB/s with buffered IO

Profile?

My guess is that you're bottlenecked on the xa_lock between memory
reclaim removing folios from the page cache and the various threads
adding folios to the page cache.

If each thread has its own file, that would help.  If the threads do
their own reclaim, that would help the page cache ... but then they'd
contend on the node's lru lock instead, so just trading one pain for
another.
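
A minimal userspace sketch of that shared-file vs file-per-thread
comparison is below (not from the original mail; the thread count,
sizes and file names are made up, compile with -pthread).  If the
single mapping's xa_lock is the bottleneck, the per-thread-file run
should scale visibly better, and a profile of the shared-file run
should show the time going into the page cache insertion path.

/*
 * Illustrative sketch only: N threads do buffered pwrite()s either
 * into one shared file (one i_pages xarray, one xa_lock) or into a
 * file per thread (one xa_lock each).  All names/sizes are arbitrary.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NTHREADS   16
#define CHUNK      (1UL << 20)          /* 1 MiB per write */
#define PER_THREAD (1UL << 30)          /* 1 GiB written by each thread */

static int shared_fd = -1;              /* >= 0 => all threads share one file */

static void *writer(void *arg)
{
	long id = (long)arg;
	unsigned long off, done;
	char path[64];
	char *buf;
	int fd;

	buf = malloc(CHUNK);
	if (!buf)
		return NULL;
	memset(buf, id, CHUNK);

	if (shared_fd >= 0) {
		fd = shared_fd;
	} else {
		snprintf(path, sizeof(path), "bufio-test-%ld", id);
		fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
		if (fd < 0) { perror("open"); free(buf); return NULL; }
	}

	/* Disjoint offsets so the shared-file run has no overlapping writes. */
	off = id * PER_THREAD;
	for (done = 0; done < PER_THREAD; done += CHUNK)
		if (pwrite(fd, buf, CHUNK, off + done) != (ssize_t)CHUNK) {
			perror("pwrite");
			break;
		}

	if (shared_fd < 0)
		close(fd);
	free(buf);
	return NULL;
}

int main(int argc, char **argv)
{
	pthread_t tid[NTHREADS];
	long i;

	/* "shared" argument => one file for everyone; default => file per thread */
	if (argc > 1 && !strcmp(argv[1], "shared")) {
		shared_fd = open("bufio-test-shared", O_CREAT | O_WRONLY | O_TRUNC, 0644);
		if (shared_fd < 0) { perror("open"); return 1; }
	}

	for (i = 0; i < NTHREADS; i++)
		pthread_create(&tid[i], NULL, writer, (void *)i);
	for (i = 0; i < NTHREADS; i++)
		pthread_join(tid[i], NULL);

	if (shared_fd >= 0)
		close(shared_fd);
	return 0;
}

Running it once with "shared" and once without, on the same filesystem
and with the same total bytes written, and comparing throughput and
profiles, would at least confirm or rule out the page cache insertion
lock before worrying about the per-node lru lock.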