From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id EA860C04A68 for ; Thu, 28 Jul 2022 03:15:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: Content-Type:In-Reply-To:From:References:Cc:To:Subject:MIME-Version:Date: Message-ID:Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=ta2G9klsvXR+xvUUntB/Z1eBrc1nuW/vuoGWZ4R1jTU=; b=LCVUYht8jK0uMSsDB+CsL9G2YZ ECxt/spj4mbpTKsso+CazNupViTy98mEN5A9uPNjYN3iVcNrKIGcEBKMKufMGI8Wdx4mRXlKONOnG 3Awa/o3cWBBerYorzofqJTfw4u44O26djLyz76bst0Q4JtZVf6NsIwKcsX1olJNtkN7myNr3ypV+D Eh1/Rz8f/5yRbwactg7vBd4A0sgxok/Za0OJsZzTk1IDCltbw6BEmF6zfUT1+4VEvJ/A1k84qJWdg XpHcwkub3LMBT4jGmXggJSpwiFX0+4XlerBVTgmTsRLO+CdfKh+1vg/wqBgN7Kk+7H6att51aUjej 1YKfdurw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1oGtzH-003Psp-Qs; Thu, 28 Jul 2022 03:14:59 +0000 Received: from esa2.hgst.iphmx.com ([68.232.143.124]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1oGtzE-003PpM-96 for linux-nvme@lists.infradead.org; Thu, 28 Jul 2022 03:14:58 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1658978096; x=1690514096; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=V9Bt9U8xCLdifnqBhnRTi+jyLLn32+9OAJoJPTVsyn0=; b=ICA6zWiCzMw3sbi7BlJSKM3eXxhQzHJtZ+fu4pITcJ1juT38DMiajtMn r4Hs/HtaTs/YqLKcQH02A8YuUtjvLy+EdDp69iplWyu0/uzeKiTTTaPkU cYfsGoj9mW2lgJfjKQusGQGb9sg0YxzW0P1yrUYOuAfgR6pDC11DeGM1C Pu2NYC4eVk1Tz9DzQrXyRDjvtpAMS2AviskpvhZuBNIHBmrWr39r6AUcm wf7ClOL+caqbVSxhr5xLKJzq7MAW4llfLvqb3e2s7SKRKSZrNdT/dNHYv USFuCfxxp+XsyFXpJi2GgqgR5KXS2Zyp9keIfl4g2g5S73I4bTFOoGYWh Q==; X-IronPort-AV: E=Sophos;i="5.93,196,1654531200"; d="scan'208";a="311434666" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 28 Jul 2022 11:14:50 +0800 IronPort-SDR: +MK7/fWtrzZuAVBFXoWcZzaGa4myIojz91Xe8EbtBMV5M+KTHM5A//HWnj+pX08yUraxQLMmU4 sckY/JlcNGSNVEj2L29uj3OP82WddiXfBCviv9n8VEJFD/L0ss6K/Sa0o4vJAQe3J9HPEMxVHd EW6C2KbO8paP4kjicASV8H7iQApuMOMj2gYXCEQKIspwY3SuTPhcUn0iJaBopXCTWRlTJXbwGU wneIatmtqtHi980msPM1DvnYzE0Z2Dc+/ZNZBPDZQuo0waUejByEcffk4ZB70opNjmLMjZ7G0R KOWsv6CgCQ0dXEJCWCmmYTmJ Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 27 Jul 2022 19:36:01 -0700 IronPort-SDR: 2wxXsFhXikYCLTBewxBEvND7HUnYJe3NPUFawTG/yLn11zup3s5jL4L6RwuL/SXSaeSXHAnjZD lh0KxJS1FHQcliXAKgV3jEJrCpbNFFQtBXb2nx0uil/+BWgttG94GlETA5AsT+Sl5mDkertuoI YAjO7m36pjCtzqJUBb5UVCI8gTR4sWCVMNuJVRMj5YFBerjhwqNjLV87dDtS9hTULtsba2Hscu 1gTUI7JTuCO745Y970Nzbn8Yfp2HeoV4xy0NYX6YlEChJebDCMnu37WPI5sGVCQ2uUjJhMnUIG 5mg= WDCIronportException: Internal Received: from usg-ed-osssrv.wdc.com ([10.3.10.180]) by uls-op-cesaip02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 27 Jul 2022 20:14:51 -0700 Received: from usg-ed-osssrv.wdc.com (usg-ed-osssrv.wdc.com [127.0.0.1]) by usg-ed-osssrv.wdc.com (Postfix) with ESMTP id 4LtbNp10HLz1Rwry for ; Wed, 27 Jul 2022 20:14:50 -0700 (PDT) Authentication-Results: usg-ed-osssrv.wdc.com (amavisd-new); dkim=pass reason="pass (just generated, assumed good)" header.d=opensource.wdc.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d= opensource.wdc.com; h=content-transfer-encoding:content-type :in-reply-to:organization:from:references:to:content-language :subject:user-agent:mime-version:date:message-id; s=dkim; t= 1658978088; x=1661570089; bh=V9Bt9U8xCLdifnqBhnRTi+jyLLn32+9OAJo JPTVsyn0=; b=muqLFW9oCZrGmwA88CMAX3StJrDnBNGnuHz8ILJFLXO3hUFE+dF fSLo4ZmR6g0gLggQqVwLkom7EgnXX0ejzpN2TDO1xlfHCGq5Xj67O/j/NbeoZnGJ 1NW5eFMb8ywhQ7/pxw2w7BIxuLzAyqOrxH75UxsfEc/b7yY4lC4PS7mBFI++fVxz LWFury831jls0EiqkzO2YZlDx/bDloxhU+jVQHAemBXwysfymHlfojwgeTN7WeEo jnVaWoYaElOPqzYIb/e9jdawqh7YYxvWSKMhDlMMzqGnIyUroFheQgVl8PLzMf5+ hfu+j/R/IdQv/gcOryR0tLq2sh2j+6CLwbg== X-Virus-Scanned: amavisd-new at usg-ed-osssrv.wdc.com Received: from usg-ed-osssrv.wdc.com ([127.0.0.1]) by usg-ed-osssrv.wdc.com (usg-ed-osssrv.wdc.com [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id Z_277ewO0oJl for ; Wed, 27 Jul 2022 20:14:48 -0700 (PDT) Received: from [10.225.163.14] (unknown [10.225.163.14]) by usg-ed-osssrv.wdc.com (Postfix) with ESMTPSA id 4LtbNj19KLz1RtVk; Wed, 27 Jul 2022 20:14:44 -0700 (PDT) Message-ID: Date: Thu, 28 Jul 2022 12:14:43 +0900 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0 Subject: Re: [PATCH v8 05/11] null_blk: allow non power of 2 zoned devices Content-Language: en-US To: Pankaj Raghav , hch@lst.de, axboe@kernel.dk, snitzer@kernel.org, Johannes.Thumshirn@wdc.com Cc: matias.bjorling@wdc.com, gost.dev@samsung.com, linux-kernel@vger.kernel.org, hare@suse.de, linux-block@vger.kernel.org, pankydev8@gmail.com, bvanassche@acm.org, jaegeuk@kernel.org, dm-devel@redhat.com, linux-nvme@lists.infradead.org, Luis Chamberlain References: <20220727162245.209794-1-p.raghav@samsung.com> <20220727162245.209794-6-p.raghav@samsung.com> From: Damien Le Moal Organization: Western Digital Research In-Reply-To: <20220727162245.209794-6-p.raghav@samsung.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220727_201456_392037_EDCF388F X-CRM114-Status: GOOD ( 29.83 ) X-BeenThere: linux-nvme@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "Linux-nvme" Errors-To: linux-nvme-bounces+linux-nvme=archiver.kernel.org@lists.infradead.org On 7/28/22 01:22, Pankaj Raghav wrote: > Convert the power of 2 based calculation with zone size to be generic in > null_zone_no with optimization for power of 2 based zone sizes. > > The nr_zones calculation in null_init_zoned_dev has been replaced with a > division without special handling for power of 2 based zone sizes as > this function is called only during the initialization and will not > invoked in the hot path. > > Performance Measurement: > > Device: > zone size = 128M, blocksize=4k > > FIO cmd: > > fio --name=zbc --filename=/dev/nullb0 --direct=1 --zonemode=zbd --size=23G > --io_size= --ioengine=io_uring --iodepth= --rw= --bs=4k > --loops=4 > > The following results are an average of 4 runs on AMD Ryzen 5 5600X with > 32GB of RAM: > > Sequential Write: > > x-----------------x---------------------------------x---------------------------------x > | IOdepth | 8 | 16 | > x-----------------x---------------------------------x---------------------------------x > | | KIOPS |BW(MiB/s) | Lat(usec) | KIOPS |BW(MiB/s) | Lat(usec) | > x-----------------x---------------------------------x---------------------------------x > | Without patch | 578 | 2257 | 12.80 | 576 | 2248 | 25.78 | > x-----------------x---------------------------------x---------------------------------x > | With patch | 581 | 2268 | 12.74 | 576 | 2248 | 25.85 | > x-----------------x---------------------------------x---------------------------------x > > Sequential read: > > x-----------------x---------------------------------x---------------------------------x > | IOdepth | 8 | 16 | > x-----------------x---------------------------------x---------------------------------x > | | KIOPS |BW(MiB/s) | Lat(usec) | KIOPS |BW(MiB/s) | Lat(usec) | > x-----------------x---------------------------------x---------------------------------x > | Without patch | 667 | 2605 | 11.79 | 675 | 2637 | 23.49 | > x-----------------x---------------------------------x---------------------------------x > | With patch | 667 | 2605 | 11.79 | 675 | 2638 | 23.48 | > x-----------------x---------------------------------x---------------------------------x > > Random read: > > x-----------------x---------------------------------x---------------------------------x > | IOdepth | 8 | 16 | > x-----------------x---------------------------------x---------------------------------x > | | KIOPS |BW(MiB/s) | Lat(usec) | KIOPS |BW(MiB/s) | Lat(usec) | > x-----------------x---------------------------------x---------------------------------x > | Without patch | 522 | 2038 | 15.05 | 514 | 2006 | 30.87 | > x-----------------x---------------------------------x---------------------------------x > | With patch | 522 | 2039 | 15.04 | 523 | 2042 | 30.33 | > x-----------------x---------------------------------x---------------------------------x > > Minor variations are noticed in Sequential write with io depth 8 and > in random read with io depth 16. But overall no noticeable differences > were noticed > > Reviewed-by: Luis Chamberlain > Reviewed by: Adam Manzanares > Reviewed-by: Hannes Reinecke > Signed-off-by: Pankaj Raghav > --- > drivers/block/null_blk/main.c | 5 ++--- > drivers/block/null_blk/null_blk.h | 6 ++++++ > drivers/block/null_blk/zoned.c | 18 +++++++++++------- > 3 files changed, 19 insertions(+), 10 deletions(-) > > diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c > index c451c477978f..f1e0605dee94 100644 > --- a/drivers/block/null_blk/main.c > +++ b/drivers/block/null_blk/main.c > @@ -1976,9 +1976,8 @@ static int null_validate_conf(struct nullb_device *dev) > if (dev->queue_mode == NULL_Q_BIO) > dev->mbps = 0; > > - if (dev->zoned && > - (!dev->zone_size || !is_power_of_2(dev->zone_size))) { > - pr_err("zone_size must be power-of-two\n"); > + if (dev->zoned && !dev->zone_size) { > + pr_err("Invalid zero zone size\n"); > return -EINVAL; > } > > diff --git a/drivers/block/null_blk/null_blk.h b/drivers/block/null_blk/null_blk.h > index 94ff68052b1e..ece6dded9508 100644 > --- a/drivers/block/null_blk/null_blk.h > +++ b/drivers/block/null_blk/null_blk.h > @@ -83,6 +83,12 @@ struct nullb_device { > unsigned int imp_close_zone_no; > struct nullb_zone *zones; > sector_t zone_size_sects; > + /* > + * zone_size_sects_shift is only useful when the zone size is > + * power of 2. This variable is set to zero when zone size is non > + * power of 2. > + */ Simplify: /* * zone_size_sects_shift is used only when the zone size is a * power of 2 number of sectors. */ But personally, I would simply drop this comment. The name is obvious and the code very clear. > + unsigned int zone_size_sects_shift; > bool need_zone_res_mgmt; > spinlock_t zone_res_lock; > > diff --git a/drivers/block/null_blk/zoned.c b/drivers/block/null_blk/zoned.c > index 55a69e48ef8b..015f6823706c 100644 > --- a/drivers/block/null_blk/zoned.c > +++ b/drivers/block/null_blk/zoned.c > @@ -16,7 +16,10 @@ static inline sector_t mb_to_sects(unsigned long mb) > > static inline unsigned int null_zone_no(struct nullb_device *dev, sector_t sect) > { > - return sect >> ilog2(dev->zone_size_sects); > + if (dev->zone_size_sects_shift) > + return sect >> dev->zone_size_sects_shift; > + > + return div64_u64(sect, dev->zone_size_sects); > } > > static inline void null_lock_zone_res(struct nullb_device *dev) > @@ -65,10 +68,6 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q) > sector_t sector = 0; > unsigned int i; > > - if (!is_power_of_2(dev->zone_size)) { > - pr_err("zone_size must be power-of-two\n"); > - return -EINVAL; > - } > if (dev->zone_size > dev->size) { > pr_err("Zone size larger than device capacity\n"); > return -EINVAL; > @@ -86,9 +85,14 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q) > zone_capacity_sects = mb_to_sects(dev->zone_capacity); > dev_capacity_sects = mb_to_sects(dev->size); > dev->zone_size_sects = mb_to_sects(dev->zone_size); > - dev->nr_zones = round_up(dev_capacity_sects, dev->zone_size_sects) > - >> ilog2(dev->zone_size_sects); > > + if (is_power_of_2(dev->zone_size_sects)) > + dev->zone_size_sects_shift = ilog2(dev->zone_size_sects); > + else > + dev->zone_size_sects_shift = 0; > + > + dev->nr_zones = DIV_ROUND_UP_SECTOR_T(dev_capacity_sects, > + dev->zone_size_sects); > dev->zones = kvmalloc_array(dev->nr_zones, sizeof(struct nullb_zone), > GFP_KERNEL | __GFP_ZERO); > if (!dev->zones) -- Damien Le Moal Western Digital Research