From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.2 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 55780CA9ED3 for ; Tue, 5 Nov 2019 03:33:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 274EA21882 for ; Tue, 5 Nov 2019 03:33:38 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20150623.gappssmtp.com header.i=@kernel-dk.20150623.gappssmtp.com header.b="H+doWd0W" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729836AbfKEDdh (ORCPT ); Mon, 4 Nov 2019 22:33:37 -0500 Received: from mail-pg1-f194.google.com ([209.85.215.194]:35380 "EHLO mail-pg1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728910AbfKEDdh (ORCPT ); Mon, 4 Nov 2019 22:33:37 -0500 Received: by mail-pg1-f194.google.com with SMTP id q22so5396859pgk.2 for ; Mon, 04 Nov 2019 19:33:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=tlOex+gWQYpxiWEeGdiftgedeC2MlUqgIjmdPHTDpf0=; b=H+doWd0WkLNu+BAH1vaVd+0Fsd2zJwxeBtPr+39yxhFdR1sLPRgZKpvy18csTXKhkM qNmtHn9mzma/GA5hHBGXaVF/OAO08h3OPqKM67DGnGlUbv2c677ltX5obAUhYyeYbUXw jAAa2ZsYJOvrI5yD3GgE2IsARdmkSPWUJfPvo1ifdjYHTJuTufI/lPSRL0Y08uDCfs6N phFimrv5nSpfYdOyxY2Z+capVO+xG2f282wgqHloOkKjwgD6zi07OC4b/zeoGK1KnQUX rnb8ud8xrzoethH9N9WG3u+0nI+HecUyGQyJf5l71JSHTmYI+2myTQ4yiJ8BdlTk7Anx 0MSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=tlOex+gWQYpxiWEeGdiftgedeC2MlUqgIjmdPHTDpf0=; b=TivT6D2ti+ad91GHKfaihfKkzJzkXpixO7PKh0o70hvCvJP3Q9jrATM53YEel8CzGo C9l6va+nPMrQpNKztJclMhPhIh5tcY/4Yrpwsojhn4RhZgNeLsELBTusZa8ELXJi5NqC RgDUdpy3943qAfyJUYVegbBI8uWjjX3OfPkYi2ReuxIfZjt1C8jRGdGVvtciZlniHOAj kIitbY4sZAYK6MHV3zjZM6rgpTnPCItBT+sH4Gr/HumEWR8q8tZZbtZqU6QALkJxZXTU 2LLuHX3tHy4DOTjAg+uiWPgqA7y5pihJqibixRlwDoSbAfU3pks8r6ltFl2iASnr4JYu elng== X-Gm-Message-State: APjAAAU3F2P97FIj++7NHz91xYtVT9kyQY7iAIiu0oyAFQwujqIakVY8 +gfWLt/+FUnOgorGoPvXRUqTkg== X-Google-Smtp-Source: APXvYqzEKeOkFh9vf47GMd6yj2FMPFw2hdlQUS3eu6OZlU4mrLvUX9yNWOgohfsiTZaIAhCT4qa2ew== X-Received: by 2002:a17:90a:6583:: with SMTP id k3mr3471330pjj.50.1572924816378; Mon, 04 Nov 2019 19:33:36 -0800 (PST) Received: from [192.168.1.188] ([66.219.217.79]) by smtp.gmail.com with ESMTPSA id q26sm15266257pgk.60.2019.11.04.19.33.34 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 04 Nov 2019 19:33:35 -0800 (PST) Subject: Re: [PATCH V4] block: optimize for small block size IO To: Kent Overstreet Cc: Ming Lei , Christoph Hellwig , linux-block@vger.kernel.org, Coly Li , Keith Busch , linux-bcache@vger.kernel.org References: <20191104181403.GA8984@kmo-pixel> <20191104181541.GA21116@infradead.org> <20191104181742.GC8984@kmo-pixel> <20191104184217.GD8984@kmo-pixel> <20191105011135.GD11436@ming.t460p> <20191105021130.GB18564@moria.home.lan> <20191105022046.GF11436@ming.t460p> <20191105023002.GC18564@moria.home.lan> <20191105031417.GA5872@moria.home.lan> From: Jens Axboe Message-ID: Date: Mon, 4 Nov 2019 20:33:32 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: <20191105031417.GA5872@moria.home.lan> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org On 11/4/19 8:14 PM, Kent Overstreet wrote: > On Mon, Nov 04, 2019 at 07:38:42PM -0700, Jens Axboe wrote: >> This is where my knee jerk at the initial "partial completions" and >> "should be trivial" start to kick in. I don't think they are >> necessarily hard, but they aren't free either. And you'd need to be >> paying that atomic_dec cost for every IO. > > No need - you added the code to avoid that atomic dec for bi_remaining > in the common case, the same approach will work here. I guess that would work for the common case of not splitting. If we split, then it's OK to pay a higher cost. We would have anyway, with the existing code. >> currently have to do, maybe not... If it's a clear win, then it'd be >> an interesting path to pursue. But we probably won't have that answer >> until at least a hacky version is done as proof of concept. >> >> On the upside, it'd simplify things to just have the mapping in one >> place, when the request is setup. Though until all drivers do that >> (which I worry will be never), then we'd be stuck with both. Maybe >> that's a bit to pessimistic, should be easier now since we just have >> blk-mq. > > blk_rq_map_sg isn't called from _that_ many places, I suspect once > it's figured out for one driver the rest won't be that bad. It's definitely easier than it would have been, most things are pretty streamlined now with the blk-mq conversion. And the ones that don't call blk_rq_map_sg() usually don't do DMA on the requests. Sizes tend to be more arbitrary there, and not hard boundaries. > And even if some drivers remain unconverted, I personally _much_ > prefer this approach to more special case fast paths, and I bet this > approach will be faster anyways. > > Also - regarding driver allocating of the sglists, I think most high > performance drivers preallocate a pool of sglists that are all sized > to what the device is capable of. They do, but most of them probably also assume on sg list per request. We'd have one request now instead of multiple, so either serializing it (which would definitely suck for some common cases), or doing something very funky with mapping to multiple requests at the same time. But I don't think we should argue this much more. If someone wants to do the work to make this work, even a prototype, it's much better to argue over actual code then potential issues and wins now. -- Jens Axboe