From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jens Axboe
Subject: Re: ext4 vs btrfs performance on SSD array
Date: Fri, 05 Sep 2014 10:50:06 -0600
Message-ID: <5409E9BE.2040002@kernel.dk>
References: <20140902000822.GA20473@dastard> <20140902012222.GA21405@infradead.org> <20140903100158.34916d34@notabene.brown> <20140905160808.GA7967@infradead.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Cc: NeilBrown, Dave Chinner, Nikolai Grigoriev, linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-raid@vger.kernel.org, linux-mm@kvack.org
To: Jeff Moyer, Christoph Hellwig
Return-path:
Received: from mail-pa0-f52.google.com ([209.85.220.52]:56838 "EHLO mail-pa0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751074AbaIEQuB (ORCPT ); Fri, 5 Sep 2014 12:50:01 -0400
Received: by mail-pa0-f52.google.com with SMTP id eu11so22507287pac.25 for ; Fri, 05 Sep 2014 09:50:01 -0700 (PDT)
In-Reply-To:
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On 09/05/2014 10:40 AM, Jeff Moyer wrote:
> Christoph Hellwig writes:
>
>> On Wed, Sep 03, 2014 at 10:01:58AM +1000, NeilBrown wrote:
>>> Do we still need maximums at all?
>>
>> I don't think we do. At least on any system I work with I have to
>> increase them to get good performance without any adverse effect on
>> throttling.
>>
>>> So can we just remove the limit on max_sectors and the RAID5 stripe cache
>>> size? I'm certainly keen to remove the later and just use a mempool if the
>>> limit isn't needed.
>>> I have seen reports that a very large raid5 stripe cache size can cause
>>> a reduction in performance. I don't know why but I suspect it is a bug that
>>> should be found and fixed.
>>>
>>> Do we need max_sectors ??
>
> I'm assuming we're talking about max_sectors_kb in
> /sys/block/sdX/queue/.
>
>> I'll send a patch to remove it and watch for the fireworks..
>
> :) I've seen SSDs that actually degrade in performance if I/O sizes
> exceed their internal page size (using artificial benchmarks; I never
> confirmed that with actual workloads). Bumping the default might not be
> bad, but getting rid of the tunable would be a step backwards, in my
> opinion.
>
> Are you going to bump up BIO_MAX_PAGES while you're at it?

The reason it's 256 right now (or since forever, actually) is that this
is one single 4kb page. If you go higher, that would require a higher
order allocation. Not impossible, but it's definitely a potential issue.
It's a lot saner to string bios at that point, with separate 0 order
allocs.

-- 
Jens Axboe
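
The page-size arithmetic behind that limit can be sketched roughly as follows. This is a userspace illustration, not kernel code: the `BioVec` layout (one page pointer plus two 32-bit fields, 16 bytes on a 64-bit build) is an assumption about what `struct bio_vec` looks like, and `alloc_order` is a hypothetical helper that only mimics how an allocation order would be chosen for a given number of entries.

```python
import ctypes

PAGE_SIZE = 4096  # assumed 4 KiB pages


class BioVec(ctypes.Structure):
    """Userspace stand-in for struct bio_vec (assumed layout, not the kernel's header)."""
    _fields_ = [
        ("bv_page", ctypes.c_void_p),  # pointer to the page
        ("bv_len", ctypes.c_uint),     # length of the segment in bytes
        ("bv_offset", ctypes.c_uint),  # offset within the page
    ]


def alloc_order(nr_vecs, vec_size=16, page_size=PAGE_SIZE):
    """Return the smallest order such that (page_size << order) holds nr_vecs entries.

    vec_size defaults to 16 bytes, the assumed size of one entry on 64-bit.
    """
    needed = nr_vecs * vec_size
    order = 0
    while (page_size << order) < needed:
        order += 1
    return order


if __name__ == "__main__":
    # Size of the stand-in struct on the machine running this script.
    print("sizeof(BioVec) here:", ctypes.sizeof(BioVec))
    for nr_vecs in (256, 512, 1024):
        print(nr_vecs, "vecs -> order", alloc_order(nr_vecs))
```

Under those assumptions, 256 entries fill exactly one page (order 0), and any larger count needs at least an order-1 allocation, which is the case the reply suggests avoiding by chaining bios instead.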