From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752042AbbIQPzh (ORCPT ); Thu, 17 Sep 2015 11:55:37 -0400
Received: from mail-io0-f174.google.com ([209.85.223.174]:33649 "EHLO
	mail-io0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751477AbbIQPzg (ORCPT );
	Thu, 17 Sep 2015 11:55:36 -0400
Subject: Re: [PATCH] block: blk-merge: fast-clone bio when splitting rw bios
To: Ming Lei
References: <1442502807-24377-1-git-send-email-ming.lei@canonical.com>
 <55FAD9EF.5040100@kernel.dk>
Cc: Linux Kernel Mailing List, Christoph Hellwig, Kent Overstreet,
 Ming Lin, Dongsu Park
From: Jens Axboe
Message-ID: <55FAE276.8040208@kernel.dk>
Date: Thu, 17 Sep 2015 09:55:34 -0600
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/17/2015 09:50 AM, Ming Lei wrote:
> On Thu, Sep 17, 2015 at 11:19 PM, Jens Axboe wrote:
>> On 09/17/2015 09:13 AM, Ming Lei wrote:
>>>
>>> biovecs has become immutable since v3.13, so it isn't necessary
>>> to allocate biovecs for the new cloned bios, then we can save
>>> one extra biovecs allocation/copy, and the allocation is often
>>> not fixed-length and a bit more expensive.
>>>
>>> For example, if the 'max_sectors_kb' of null blk's queue is set
>>> as 16(32 sectors) via sysfs just for making more splits, this patch
>>> can increase throughput about ~70% in the sequential read test over
>>> null_blk(direct io, bs: 1M).
>>
>>
>> I'd be curious how this compares to before we did the splitting, not
>> exceeding the limits through bio_add_page() instead?
>
> Let me show these test results:
>
> ----------------------------------------------------------------------------------
> kernel                              | throughput
> ----------------------------------------------------------------------------------
> 4.3.0-rc1-next-20150916             | bw=12227MB/s, iops=12227
> ----------------------------------------------------------------------------------
> 4.3.0-rc1-next-20150916 with patch  | bw=21011MB/s, iops=21011
> ----------------------------------------------------------------------------------
> v4.2                                | bw=18959MB/s, iops=18958
> ----------------------------------------------------------------------------------
>
> So from the above, looks this patch is kind of fix for performance regression
> introduced by 54efd50bfd(block: make generic_make_request handle
> arbitrarily sized bios), :-)

So that's 1MB user IO, and 16KB device limit, correct? If that is the
case, then the results make sense. And looks like we're still ahead of
the older bio_add_page() approach, which is what I mostly cared about.

Thanks! I'll apply this for -rc2.

-- 
Jens Axboe
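
For anyone wanting to sanity-check the numbers in this thread, here is a rough Python sketch of the arithmetic. It is not part of the patch or of the test setup. The bandwidth figures are copied from the table above. The helper names (`splits_per_bio`, `percent_change`) are made up for illustration, and the split count assumes each 1MB IO is cut into equal pieces of at most 16KB.

```python
# Bandwidth figures (MB/s) copied from the table quoted in this thread
# (null_blk, direct IO, bs=1M). Dictionary keys are shorthand labels.
BW_MB_S = {
    "next-20150916": 12227,
    "next-20150916 + patch": 21011,
    "v4.2": 18959,
}

USER_IO_KB = 1024     # bs: 1M
MAX_SECTORS_KB = 16   # queue limit set via sysfs (32 sectors of 512 bytes)


def splits_per_bio(io_kb, limit_kb):
    """Hypothetical helper: pieces one IO is cut into (ceiling division)."""
    return -(-io_kb // limit_kb)


def percent_change(new, old):
    """Hypothetical helper: relative change of `new` versus `old`, in percent."""
    return (new - old) / old * 100.0


pieces = splits_per_bio(USER_IO_KB, MAX_SECTORS_KB)
gain_from_patch = percent_change(BW_MB_S["next-20150916 + patch"],
                                 BW_MB_S["next-20150916"])
patched_vs_v42 = percent_change(BW_MB_S["next-20150916 + patch"],
                                BW_MB_S["v4.2"])
unpatched_vs_v42 = percent_change(BW_MB_S["next-20150916"],
                                  BW_MB_S["v4.2"])

print(f"pieces per 1MB IO:      {pieces}")
print(f"patched vs unpatched:   {gain_from_patch:+.1f}%")
print(f"patched vs v4.2:        {patched_vs_v42:+.1f}%")
print(f"unpatched vs v4.2:      {unpatched_vs_v42:+.1f}%")
```

Under those assumptions each 1MB IO is split into 64 pieces. The computed gain from the patch is in line with the "~70%" quoted above, and the patched kernel comes out ahead of v4.2.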