Subject: Re: don't use ->bd_inode to access the block device size v3
To: Christoph Hellwig
Cc: Coly Li, Mike Snitzer, Song Liu, David Sterba, Josef Bacik,
 Theodore Ts'o, OGAWA Hirofumi, Dave Kleikamp, Ryusuke Konishi,
 Anton Altaparmakov, Konstantin Komarov, Kees Cook, Phillip Lougher,
 Jan Kara, linux-block@vger.kernel.org, dm-devel@redhat.com,
 drbd-dev@lists.linbit.com, linux-bcache@vger.kernel.org,
 linux-raid@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-scsi@vger.kernel.org, target-devel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org,
 linux-ext4@vger.kernel.org, jfs-discussion@lists.sourceforge.net,
 linux-nfs@vger.kernel.org, linux-nilfs@vger.kernel.org,
 linux-ntfs-dev@lists.sourceforge.net, ntfs3@lists.linux.dev,
 reiserfs-devel@vger.kernel.org
References: <20211018101130.1838532-1-hch@lst.de>
 <4a8c3a39-9cd3-5b2f-6d0f-a16e689755e6@kernel.dk>
 <20211018171843.GA3338@lst.de>
From: Jens Axboe
Message-ID: <2f5dcf79-8419-45ff-c27c-68d43242ccfe@kernel.dk>
Date: Mon, 18 Oct 2021 11:40:51 -0600
In-Reply-To: <20211018171843.GA3338@lst.de>

On 10/18/21 11:18 AM, Christoph Hellwig wrote:
> On Mon, Oct 18, 2021 at 11:16:08AM -0600, Jens Axboe wrote:
>> This looks good to me.
>> Followup question, as it's related - I've got a
>> hacky patch that caches the inode size in the bdev:
>>
>> https://git.kernel.dk/cgit/linux-block/commit/?h=perf-wip&id=c754951eb7193258c35a574bd1ccccb7c4946ee4
>>
>> so we don't have to dip into the inode itself for the fast path. While
>> it's obviously not something being proposed for inclusion right now, is
>> there a world in which we can make something like that work?
>
> There's just two places that update i_size for block devices:
> set_capacity and bdev_set_nr_sectors. So you just need to update
> bd_nr_sectors there and you're done.

This on top of your patches should do the trick, then.


commit eebb7c5048163985fb21d6cb740ebac78cb46051
Author: Jens Axboe
Date:   Mon Oct 18 11:39:45 2021 -0600

    block: cache inode size in bdev

    Reading the inode size brings in a new cacheline for IO submit, and
    it's in the hot path being checked for every single IO. When doing
    millions of IOs per core per second, this is noticeable overhead.

    Cache the nr_sectors in the bdev itself.

    Signed-off-by: Jens Axboe

diff --git a/block/genhd.c b/block/genhd.c
index 759bc06810f8..53495e3391e3 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -58,6 +58,7 @@ void set_capacity(struct gendisk *disk, sector_t sectors)
 
 	spin_lock(&bdev->bd_size_lock);
 	i_size_write(bdev->bd_inode, (loff_t)sectors << SECTOR_SHIFT);
+	bdev->bd_nr_sectors = sectors;
 	spin_unlock(&bdev->bd_size_lock);
 }
 EXPORT_SYMBOL(set_capacity);
diff --git a/block/partitions/core.c b/block/partitions/core.c
index 9dbddc355b40..66ef9bc6d6a1 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -91,6 +91,7 @@ static void bdev_set_nr_sectors(struct block_device *bdev, sector_t sectors)
 {
 	spin_lock(&bdev->bd_size_lock);
 	i_size_write(bdev->bd_inode, (loff_t)sectors << SECTOR_SHIFT);
+	bdev->bd_nr_sectors = sectors;
 	spin_unlock(&bdev->bd_size_lock);
 }
 
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 472e55e0e94f..fe065c394fff 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -39,6 +39,7 @@ struct bio_crypt_ctx;
 
 struct block_device {
 	sector_t		bd_start_sect;
+	sector_t		bd_nr_sectors;
 	struct disk_stats __percpu *bd_stats;
 	unsigned long		bd_stamp;
 	bool			bd_read_only;	/* read-only policy */
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 7b0326661a1e..001f617f82da 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -238,7 +238,7 @@ static inline sector_t get_start_sect(struct block_device *bdev)
 
 static inline loff_t bdev_nr_bytes(struct block_device *bdev)
 {
-	return i_size_read(bdev->bd_inode);
+	return (loff_t)bdev->bd_nr_sectors << SECTOR_SHIFT;
 }
 
 static inline sector_t bdev_nr_sectors(struct block_device *bdev)

-- 
Jens Axboe
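
For readers tracking the units in the patch above, here is a minimal user-space sketch of the same idea: keep the size in sectors next to the byte-sized inode field and derive bytes from the cached value. It is not kernel code. `struct model_bdev` and the `model_*` helpers are hypothetical names invented for illustration; only `SECTOR_SHIFT` (assumed to be 9, i.e. 512-byte sectors) and the field name `bd_nr_sectors` come from the patch. The `bd_size_lock` locking is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* assumed: 512-byte sectors */

typedef uint64_t sector_t;

/* Hypothetical stand-in for struct block_device; not the kernel layout. */
struct model_bdev {
	int64_t  inode_size;	/* models bd_inode->i_size, in bytes */
	sector_t bd_nr_sectors;	/* cached size, in sectors */
};

/* Update both copies together, as set_capacity() does in the patch. */
static void model_set_nr_sectors(struct model_bdev *bdev, sector_t sectors)
{
	/* The byte count must still fit in a signed 64-bit value. */
	assert(sectors <= (sector_t)(INT64_MAX >> SECTOR_SHIFT));

	bdev->inode_size = (int64_t)(sectors << SECTOR_SHIFT);
	bdev->bd_nr_sectors = sectors;
}

/* Fast path: read the cached sector count, never touch the inode copy. */
static sector_t model_nr_sectors(const struct model_bdev *bdev)
{
	return bdev->bd_nr_sectors;
}

/* Bytes are derived from sectors, so the shift goes the other way. */
static int64_t model_nr_bytes(const struct model_bdev *bdev)
{
	return (int64_t)(bdev->bd_nr_sectors << SECTOR_SHIFT);
}
```

After calling `model_set_nr_sectors(&b, 8)` on a zero-initialised `struct model_bdev b`, `model_nr_sectors(&b)` is 8 and `model_nr_bytes(&b)` is 4096. The point of the sketch is that a helper returning bytes has to shift the cached sector count left by `SECTOR_SHIFT`, while a helper returning sectors can return the cached field directly.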