Date: Mon, 2 May 2016 20:23:41 +0200
From: Christoph Hellwig
Subject: Re: iomap infrastructure and multipage writes V2
Message-ID: <20160502182341.GA7077@lst.de>
References: <1460494382-14547-1-git-send-email-hch@lst.de> <20160413215442.GS567@dastard>
In-Reply-To: <20160413215442.GS567@dastard>
To: Dave Chinner
Cc: rpeterso@redhat.com, linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com

Hi Dave,

sorry for taking forever to get back to this - travel to LSF and some
other meetings and a deadline last week didn't leave me any time for
XFS work.

On Thu, Apr 14, 2016 at 07:54:42AM +1000, Dave Chinner wrote:
> Christoph, have you done any perf testing of this patchset yet to
> check that it does indeed reduce the CPU overhead of large write
> operations? I'd also be interested to know if there is any change in
> overhead for single page (4k) IOs as well, even though I suspect
> there won't be.

I've done a lot of testing earlier, and this version also looks very
promising.  On the sort of hardware I have access to now the 4k numbers
don't change much, but with 1M writes we both increase the write
bandwidth a little bit and significantly lower the CPU usage.

The simple test that demonstrates this is below; the runs are from a 4p
VM with 4G of RAM, access to a fast NVMe SSD, and a data size small
enough that writeback shouldn't throttle the buffered write path:

MNT=/mnt
PERF="perf_3.16"	# soo smart to have tools in the kernel tree..

#BS=4k
#COUNT=65536
BS=1M
COUNT=256

$PERF stat dd if=/dev/zero of=$MNT/testfile bs=$BS count=$COUNT

With the baseline for-next tree I get the following bandwidth and CPU
utilization:

BS=4k:	~600MB/s	0.856 CPUs utilized	( +- 0.32% )
BS=1M:	1.45GB/s	0.820 CPUs utilized	( +- 0.77% )

With all patches applied:

BS=4k:	~610MB/s	0.848 CPUs utilized	( +- 0.36% )
BS=1M:	~1.55GB/s	0.615 CPUs utilized	( +- 0.80% )

This is also visible in the walltime:

baseline, 4k:

real	0m0.540s
user	0m0.000s
sys	0m0.533s

baseline, 1M:

real	0m0.310s
user	0m0.000s
sys	0m0.313s

multipage, 4k:

real	0m0.541s
user	0m0.010s
sys	0m0.527s

multipage, 1M:

real	0m0.272s
user	0m0.000s
sys	0m0.263s
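For reference, both configurations and the walltime measurement can be
wrapped up in one small script along these lines.  This is only a sketch
of how I drive the test above, assuming bash; the "-r 5" repeat count for
perf stat is an assumption here and not necessarily what produced the
exact numbers quoted above:

#!/bin/bash
# sketch: run the buffered-write test for both block sizes
MNT=/mnt
PERF="perf_3.16"

# 4k x 65536 and 1M x 256 both write the same 256MB of data
for cfg in "4k 65536" "1M 256"; do
	set -- $cfg
	BS=$1
	COUNT=$2

	echo "=== bs=$BS count=$COUNT ==="
	rm -f $MNT/testfile

	# CPUs-utilized numbers, averaged over 5 runs (repeat count is an
	# assumption, adjust as needed)
	$PERF stat -r 5 dd if=/dev/zero of=$MNT/testfile bs=$BS count=$COUNT

	# walltime of a single run
	time dd if=/dev/zero of=$MNT/testfile bs=$BS count=$COUNT
done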